[ 
https://issues.apache.org/jira/browse/HIVE-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529157#comment-14529157
 ] 

Vaibhav Gumashta commented on HIVE-9469:
----------------------------------------

[~manish.hadoop.w...@gmail.com] Sorry about the late response. Yes, it makes 
sense to set the datanucleus config to allow more db connections. I would size 
the connection pool based on how big the thrift worker pool is. I don't think 
10 db connections are enough for 500 worker threads.
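
To make the sizing argument concrete, here is a rough back-of-the-envelope 
sketch. It does not touch any Hive or DataNucleus API; it only does the 
arithmetic. The class name, the method name and the 100 ms hold time per 
metastore call are my own assumptions for illustration, and only the 500 
workers and 10 connections come from this issue. It assumes the worst case, 
where every worker asks for a connection at the same instant.

```java
public class PoolWaitEstimate {

    /**
     * Worst-case wait (in ms) for a connection when all workers request one
     * at the same instant and each holds it for holdMillis before returning
     * it to the pool. Hypothetical helper, not part of Hive.
     */
    public static long worstCaseWaitMillis(int workers, int poolSize, long holdMillis) {
        if (workers <= 0 || poolSize <= 0 || holdMillis < 0) {
            throw new IllegalArgumentException(
                "workers and poolSize must be positive, holdMillis non-negative");
        }
        // Workers are served in waves of poolSize; the last wave has to wait
        // for every earlier wave to finish.
        long waves = ((long) workers + poolSize - 1) / poolSize; // ceiling division
        return (waves - 1) * holdMillis;
    }

    public static void main(String[] args) {
        int workers = 500;     // thrift worker threads (from this issue)
        int poolSize = 10;     // db connections (from this issue)
        long holdMillis = 100; // ASSUMPTION: time one metastore call holds a connection

        long wait = worstCaseWaitMillis(workers, poolSize, holdMillis);
        System.out.println("Worst-case wait with " + poolSize + " connections: " + wait + " ms");

        // Same load with a pool as large as the worker pool: nobody waits.
        long waitLargePool = worstCaseWaitMillis(workers, workers, holdMillis);
        System.out.println("Worst-case wait with " + workers + " connections: " + waitLargePool + " ms");
    }
}
```

If the hold time grows under load (slow MySQL queries, GC pauses), the wait 
grows with it, and at some point it can exceed the client socket timeout, 
which would look like the "Read timed out" reported here. Real traffic is 
burstier than this model, so treat the numbers as an order of magnitude only.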

> Hive Thrift Server throws Socket Timeout Exception: Read time out
> -----------------------------------------------------------------
>
>                 Key: HIVE-9469
>                 URL: https://issues.apache.org/jira/browse/HIVE-9469
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.10.0
>         Environment: 4 core cpu, 15gb memory. 2 thrift server behind load 
> balancer
>            Reporter: Manish Malhotra
>         Attachments: After_JMV_Profiling_Tuning.jpg, 
> Before_JMV_Profiling_Tuning.jpg
>
>
> Hi All,
> Please review the following problem. I also posted the same on the hive-user 
> group, but haven't got any response yet. 
> This is happening quite frequently in our environment, 
> so it would be great if somebody could take a look and advise. 
> I'm using Hive Thrift Server in production, which at peak handles around 500 
> req/min.
> After a certain point the Hive Thrift Server stops responding 
> and throws the 
> following exception: 
> "org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out" 
> We are using MySQL as the metastore database, which the Thrift server connects to. 
> The design / architecture is like this: 
> Oozie --> Hive Action --> ELB (AWS) --> Hive Thrift (2 servers) --> MySQL 
> (Master) --> MySQL (Slave).
> Software versions: 
>    Hive version : 0.10.0
>    Hadoop: 1.2.1
> It looks like when the load goes beyond some threshold for certain operations, 
> the server has trouble responding. 
> As Hive jobs sometimes fail because of this issue, we also have an 
> auto-restart check: if the Thrift server is not responding, it stops / 
> kills and restarts the service. 
> Other tuning done: 
> Thrift Server: 
> Given 11gb heap, and configured the CMS GC algorithm. 
> MySQL: 
> Tuned innodb_buffer, tmp_table and max_heap parameters.
> So, can somebody please help me understand what the root cause could be, 
> or has anybody faced a similar issue? 
> I found one related JIRA: https://issues.apache.org/jira/browse/HCATALOG-541
> But that JIRA shows the Hive Thrift Server hitting an OOM error, whereas in 
> my case I didn't see any OOM error.
> Regards,
> Manish
> Full Exception Stack: 
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>     at $Proxy7.getDatabase(Unknown Source)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
>     at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> Caused by: java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:150)
>     at java.net.SocketInputStream.read(SocketInputStream.java:121)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>     ... 34 more
> 2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - 
> FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
>     at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
>     at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
