[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread Kurt Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060865#comment-17060865
 ] 

Kurt Young commented on FLINK-16627:


I see, i misunderstood this issue. If we delete all keys with null value in the 
json, would Flink be able to read the json again in downstream source?

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread jackray wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060863#comment-17060863
 ] 

jackray wang commented on FLINK-16627:
--

I suggest adding a parameter when defining a sinktable like “format.removenull 
= true”  ,if this parameter is true ,remove all the keys which value are 
nulls,and this parameter default value set to false ,do nothing.    Compatible 
with previous scripts, add parameters if necessary.

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060858#comment-17060858
 ] 

Benchao Li commented on FLINK-16627:


[~jark] we name it 'format.filter-null-values' in our internal branch. 

[~jackray] Would you like to fix this issue for the community?

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060852#comment-17060852
 ] 

Jark Wu commented on FLINK-16627:
-

I'm fine to have this property. What do you think [~ykt836]?

Hi [~libenchao], [~jackray], do you have any idea about what the property to 
expose? 

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread jackray wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060829#comment-17060829
 ] 

jackray wang commented on FLINK-16627:
--

[~jark]  [~ykt836] Recently, our company also needs to use Flink a lot, and I 
can do some things for flink if necessary

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060789#comment-17060789
 ] 

Benchao Li commented on FLINK-16627:


We have this need in our company too. And I supported it by add an optional 
property, which filters null cols from the result json. If we agrees that this 
is a valid feature, I can contribute it to the community.

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread jackray wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060665#comment-17060665
 ] 

jackray wang commented on FLINK-16627:
--

if i add a where like "where svt is not null" ,i can't get the row data,but i 
need the subtype even if the value of svt is null

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-17 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060661#comment-17060661
 ] 

Jark Wu commented on FLINK-16627:
-

I think user still need the record even if svt is null, because other columns 
are not null. 

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-16 Thread Kurt Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060651#comment-17060651
 ] 

Kurt Young commented on FLINK-16627:


BTW, such issue should be reported to user mailing thread, I will close this 
for now.

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16627) when insert into kafkas ,how can i remove the keys with null values of json

2020-03-16 Thread Kurt Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060650#comment-17060650
 ] 

Kurt Young commented on FLINK-16627:


why don't add a "where svt is not null" before inserting into kafka?

> when insert into kafkas ,how can i remove the keys with null values of json
> ---
>
> Key: FLINK-16627
> URL: https://issues.apache.org/jira/browse/FLINK-16627
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: jackray wang
>Priority: Major
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
> def eval(str: String) : String= { 
>if(str == null){
>return ""
>}else{
>return str
>}
> }
> 
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> 
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)