[jira] [Updated] (IMPALA-10342) Alleviating congestion caused by row-level warnings

Fifteen (Jira) Thu, 19 Nov 2020 04:51:03 -0800


     [ 
https://issues.apache.org/jira/browse/IMPALA-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Fifteen updated IMPALA-10342:
-----------------------------
    Description: 
By default, both `get_json_object()` and 
`DecimalOperators::IntToDecimalVal` raise a warning whenever they encounter an 
error. Because these functions are stateless, they keep emitting a message each 
time the error occurs. The resulting warning flood can easily overwhelm the 
cluster's processing capacity.

To be specific, we have observed these bottlenecks:

*Exchange Receiver*: the default value of `rpc_max_message_size` is 50MB. 
The flood of warning messages carried by ReportExecStatusPB can exceed that 
limit, which results in reports without a profile. Even when the report message 
is smaller than that limit, the bandwidth consumption is non-trivial.

*Storage:* as in https://issues.apache.org/jira/browse/IMPALA-5256 , warning 
messages produce huge log files, since `stdout/stderr` are not redirected when 
glog rolls the log files. 

*Coordinator*: runtime profiles are serialized to Thrift and stored in the 
Coordinator's memory. The warning flood makes `Untracked Memory` rise 
rapidly. I took a heap profile (with pprof) and found that most of the memory 
was used by RuntimeProfile and Strings. 

!image-2020-11-19-17-30-22-918.png!

 

*Imperfect Solution:*

We suffered a lot from this problem and have come up with an imperfect 
solution:
 # A straightforward fix: mute AddWarning().
 # Add a query option to re-enable the warnings when needed.
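
The two steps above can be sketched roughly as follows. This is only a minimal, self-contained illustration and not the actual patch: `QueryOptions`, `enable_row_level_warnings`, `WarningSink` and `EvaluateRows` are hypothetical names made up for the example, and the real `AddWarning()` in Impala has a different surrounding class and signature.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for per-query options. The field name is an
// assumption for illustration, not an existing Impala query option.
struct QueryOptions {
  bool enable_row_level_warnings = false;  // warnings are muted by default
};

// Hypothetical, simplified stand-in for the object that collects a
// query's warnings.
class WarningSink {
 public:
  explicit WarningSink(const QueryOptions& opts)
      : enabled_(opts.enable_row_level_warnings) {}

  // Stores the warning only if the query option re-enabled warnings.
  // Returns true if the message was stored, false if it was muted.
  bool AddWarning(const std::string& msg) {
    if (!enabled_) {
      ++suppressed_;  // keep a cheap counter instead of the full string
      return false;
    }
    warnings_.push_back(msg);
    return true;
  }

  std::size_t num_recorded() const { return warnings_.size(); }
  std::size_t num_suppressed() const { return suppressed_; }

 private:
  bool enabled_;
  std::size_t suppressed_ = 0;
  std::vector<std::string> warnings_;
};

// Simulates a stateless function that warns once for every bad row.
// Returns the number of warnings that were actually stored.
std::size_t EvaluateRows(WarningSink* sink, std::size_t num_bad_rows) {
  for (std::size_t i = 0; i < num_bad_rows; ++i) {
    sink->AddWarning("could not convert value in row " + std::to_string(i));
  }
  return sink->num_recorded();
}
```

With the option left at its default, the sink stores no strings regardless of how many rows fail, so nothing is added to the report or the profile. Setting the hypothetical option to `true` restores the original per-row behaviour for debugging.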

 

We are looking forward to a better solution from community discussion. 

 

  was:
By default, both `get_json_object()` and 
`DecimalOperators::IntToDecimalVal` raise a warning whenever they encounter an 
error. Because these functions are stateless, they keep emitting a message each 
time the error occurs. The resulting warning flood can easily overwhelm the 
cluster's processing capacity.

To be specific, we have observed these bottlenecks:

*Exchange Receiver*: the default value of `rpc_max_message_size` is 50MB. 
The flood of warning messages carried by ReportExecStatusPB can exceed that 
limit, which results in reports without a profile. Even when the report message 
is smaller than that limit, the bandwidth consumption is non-trivial.

*Storage:* as in https://issues.apache.org/jira/browse/IMPALA-5256 , warning 
messages produce huge log files, since `stdout/stderr` are not redirected when 
glog rolls the log files. 

*Coordinator*: runtime profiles are serialized to Thrift and stored in the 
Coordinator's memory. The warning flood makes `Untracked Memory` rise 
rapidly. I took a memory sample and found that most of the memory was used by 
RuntimeProfile and Strings. 

!image-2020-11-19-17-30-22-918.png!

 

*Imperfect Solution:*

We suffered a lot from this problem and have come up with an imperfect 
solution:
 # A straightforward fix: mute AddWarning().
 # Add a query option to re-enable the warnings when needed.

 

We are looking forward to a better solution from community discussion. 

 


> Alleviating congestion caused by row-level warnings 
> ----------------------------------------------------
>
>                 Key: IMPALA-10342
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10342
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Fifteen
>            Priority: Major
>         Attachments: image-2020-11-19-17-30-22-918.png, 
> impalad-ram-profile.pdf
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
