[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060865#comment-17060865 ]

Kurt Young commented on FLINK-16627:
------------------------------------

I see, I misunderstood this issue. If we delete all keys with null values from the JSON, would Flink still be able to read the JSON again in a downstream source?

> when insert into kafkas ,how can i remove the keys with null values of json
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-16627
>                 URL: https://issues.apache.org/jira/browse/FLINK-16627
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Client
>    Affects Versions: 1.10.0
>            Reporter: jackray wang
>            Priority: Major
>
> {code:sql}
> CREATE TABLE sink_kafka ( subtype STRING, svt STRING ) WITH (……)
> {code}
>
> {code:sql}
> CREATE TABLE source_kafka ( subtype STRING, svt STRING ) WITH (……)
> {code}
>
> {code:scala}
> class ScalaUpper extends ScalarFunction {
>   // Map SQL NULL to the empty string so the sink never writes a null value.
>   def eval(str: String): String = if (str == null) "" else str
> }
>
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>
> {code:sql}
> insert into sink_kafka select subtype, scala_upper(svt) from source_kafka
> {code}
>
> Sometimes svt's value is null, and the JSON inserted into Kafka looks like
> \{"subtype":"qin","svt":null}
> If the amount of data is small this is acceptable, but we process 10 TB of data
> every day and there may be many nulls in the JSON, which affects efficiency.
> If a parameter could be added to remove null keys when defining a sink table,
> the performance would be greatly improved.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060863#comment-17060863 ]

jackray wang commented on FLINK-16627:
--------------------------------------

I suggest adding a parameter to the sink table definition, such as “format.removenull = true”. If this parameter is true, all keys whose values are null are removed; if it is false (the default), nothing is done. This stays compatible with previous scripts, and the parameter is added only when necessary.
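To make the proposal concrete, here is a minimal, hypothetical sketch of what such a null-dropping serializer could do. This is an illustration only: `removenull` mirrors the proposed option (it is not an existing Flink setting), the class name is made up, and real JSON serialization would also need proper string escaping, which this sketch omits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullFilteringJsonWriter {

    // Serialize a flat string map to a JSON object. When removeNull is true,
    // entries whose value is null are skipped entirely (the proposed
    // behaviour); when false, they are written as explicit JSON nulls
    // (the current behaviour). No escaping is performed: illustration only.
    public static String toJson(Map<String, String> row, boolean removeNull) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (removeNull && e.getValue() == null) {
                continue; // drop the key instead of writing "key":null
            }
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            if (e.getValue() == null) {
                sb.append("null");
            } else {
                sb.append('"').append(e.getValue()).append('"');
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("subtype", "qin");
        row.put("svt", null);
        System.out.println(toJson(row, false)); // {"subtype":"qin","svt":null}
        System.out.println(toJson(row, true));  // {"subtype":"qin"}
    }
}
```

Because a missing key and an explicit null read back identically through `Map.get`, a downstream consumer that looks fields up by name would see the same values either way.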
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060858#comment-17060858 ]

Benchao Li commented on FLINK-16627:
------------------------------------

[~jark] We name it 'format.filter-null-values' in our internal branch.
[~jackray] Would you like to fix this issue for the community?
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060852#comment-17060852 ]

Jark Wu commented on FLINK-16627:
---------------------------------

I'm fine with having this property. What do you think, [~ykt836]?
Hi [~libenchao], [~jackray], do you have any idea about which property name to expose?
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060829#comment-17060829 ]

jackray wang commented on FLINK-16627:
--------------------------------------

[~jark] [~ykt836] Recently our company has also needed to use Flink a lot, and I can do some things for Flink if necessary.
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060789#comment-17060789 ]

Benchao Li commented on FLINK-16627:
------------------------------------

We have this need in our company too, and I supported it by adding an optional property which filters null columns out of the result JSON. If we agree that this is a valid feature, I can contribute it to the community.
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060665#comment-17060665 ]

jackray wang commented on FLINK-16627:
--------------------------------------

If I add a filter like "where svt is not null", I can't get the row at all, but I need the subtype even when the value of svt is null.
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060661#comment-17060661 ]

Jark Wu commented on FLINK-16627:
---------------------------------

I think the user still needs the record even if svt is null, because the other columns are not null.
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060651#comment-17060651 ]

Kurt Young commented on FLINK-16627:
------------------------------------

BTW, such issues should be reported to the user mailing list; I will close this for now.
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060650#comment-17060650 ]

Kurt Young commented on FLINK-16627:
------------------------------------

Why not add a "where svt is not null" filter before inserting into Kafka?