[ 
https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-13020.
------------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> catalog-topic updates >2GB do not work due to Thrift's max message size
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-13020
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13020
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.2.0, Impala 4.3.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>             Fix For: Impala 4.5.0
>
>
> Thrift 0.16.0 added a max message size to protect against malicious packets 
> that can consume a large amount of memory on the receiver side. This limit 
> is a signed 32-bit integer, so it maxes out at 2GB (2^31 - 1 bytes). We set 
> it via thrift_rpc_max_message_size.
> In catalog v1, the catalog-update statestore topic can become larger than 2GB 
> when there are a large number of tables / partitions / files. If this happens 
> and an Impala coordinator needs to start up (or needs a full topic update for 
> any other reason), it is expecting the statestore to send it the full topic 
> update, but the coordinator actually can't process the message. The 
> deserialization of the message hits the 2GB max message size limit and fails.
> On the statestore side, it shows this message:
> {noformat}
> I0418 16:54:51.727290 3844140 statestore.cc:507] Preparing initial 
> catalog-update topic update for 
> impa...@mcdonnellthrift.vpc.cloudera.com:27000. Size = 2.27 GB
> I0418 16:54:53.889446 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() <Host: mcdonnellthrift.vpc.cloudera.com Port: 23000>: Broken pipe
> I0418 16:54:53.889488 3844140 client-cache.cc:82] ReopenClient(): re-creating 
> client for mcdonnellthrift.vpc.cloudera.com:23000
> I0418 16:54:53.889493 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() <Host: mcdonnellthrift.vpc.cloudera.com Port: 23000>: Broken pipe
> I0418 16:54:53.889503 3844140 thrift-client.cc:116] Error closing connection 
> to: mcdonnellthrift.vpc.cloudera.com:23000, ignoring (write() send(): Broken 
> pipe)
> I0418 16:54:56.052882 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() <Host: mcdonnellthrift.vpc.cloudera.com Port: 23000>: Broken pipe
> I0418 16:54:56.052932 3844140 client-cache.h:363] RPC Error: Client for 
> mcdonnellthrift.vpc.cloudera.com:23000 hit an unexpected exception: write() 
> send(): Broken pipe, type: N6apache6thrift9transport19TTransportExceptionE, 
> rpc: N6impala20TUpdateStateResponseE, send: not done
> I0418 16:54:56.052937 3844140 client-cache.cc:174] Broken Connection, destroy 
> client for mcdonnellthrift.vpc.cloudera.com:23000{noformat}
> The Impala side does not report a clear error, but we see this:
> {noformat}
> I0418 16:54:53.889683 3214537 TAcceptQueueServer.cpp:355] New connection to 
> server StatestoreSubscriber from client <Host: 127.0.0.1 Port: 49632>
> I0418 16:54:54.080694 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 110
> I0418 16:54:56.080920 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 111
> I0418 16:54:58.081131 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 112
> I0418 16:55:00.081358 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 113{noformat}
> With a patched Thrift that allows an int64_t max message size, and with that 
> limit set to a larger value, Impala was able to start up (even without 
> restarting the statestored).
> Some clusters that upgrade to a newer version may hit this, because older 
> Thrift versions did not enforce this limit, so this is something we should 
> fix to avoid upgrade issues.
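
The size arithmetic behind the failure described above can be checked with a short sketch. It is illustrative only: the helper name, the reading of the log's "GB" as 2^30 bytes, and the 64 GiB "wider" cap are assumptions, not Impala or Thrift code or defaults.

```python
# Illustrative arithmetic only; none of these names come from Impala or Thrift.
INT32_MAX = 2**31 - 1  # 2,147,483,647 bytes: ceiling of a signed 32-bit limit
GIB = 1024**3          # assumption: the log's "GB" means 2^30 bytes


def fits_under_limit(message_bytes: int, max_message_size: int) -> bool:
    """Return True if a message of this size is within the receiver's cap."""
    return 0 <= message_bytes <= max_message_size


# Approximate size of the topic update reported by the statestore (2.27 GB).
topic_bytes = int(2.27 * GIB)

# Hypothetical wider cap that only a 64-bit limit could express.
wide_cap = 64 * GIB

print(fits_under_limit(topic_bytes, INT32_MAX))  # False
print(fits_under_limit(topic_bytes, wide_cap))   # True
```

Under these assumptions the 2.27 GB update exceeds the largest value a signed 32-bit limit can hold, which is consistent with the receiver rejecting it no matter how the 32-bit setting is tuned.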


