[
https://issues.apache.org/jira/browse/HIVE-10837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096061#comment-15096061
]
Bohumir Zamecnik commented on HIVE-10837:
-----------------------------------------
I came across this issue on HiveServer2 1.1.1 used via Beeline on CDH
CDH-5.4.4-1.cdh5.4.4.p0.4. The problem wasn't limited to inserting; a plain
SELECT failed as well. I queried quite a big partitioned table backed by
SequenceFiles of Protobufs. Each partition has about 6B records and is around
700GB. A query on a single partition was OK, but querying e.g. 30 partitions
(~19TB) fails. Note that the same query executed via Hive CLI 1.1.1 works OK.
The resulting number of rows is really small (the number of partitions,
e.g. <= 31). The HQL query string itself is small.
> Running large queries (inserts) fails and crashes hiveserver2
> -------------------------------------------------------------
>
> Key: HIVE-10837
> URL: https://issues.apache.org/jira/browse/HIVE-10837
> Project: Hive
> Issue Type: Bug
> Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
> Reporter: Patrick McAnneny
> Priority: Critical
>
> When running a large insert statement through beeline or pyhs2, a thrift
> error is returned and hiveserver2 crashes.
> I ran into this with large insert statements -- my initial failing query was
> around 6 million characters. After further testing, however, it seems the
> failure threshold is based on the number of inserted rows rather than the
> query's size in characters. My testing puts the failure threshold between
> 199,000 and 230,000 inserted rows.
> The thrift error is as follows:
> Error: org.apache.thrift.transport.TTransportException:
> java.net.SocketException: Broken pipe (state=08S01,code=0)
> Also note for anyone who tests this issue: when testing different queries I
> ran into https://issues.apache.org/jira/browse/HIVE-10836
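The reported threshold can be probed by generating multi-row INSERT statements of increasing size and running them through Beeline or pyhs2. A minimal sketch of such a generator is below; the table name, column layout, and value format are hypothetical and not taken from the report.

```python
# Hypothetical reproduction sketch for HIVE-10837: build a single
# multi-row "INSERT ... VALUES" statement of a given row count, so the
# statement can be pushed past the reported 199,000-230,000-row
# failure threshold. The table and value shapes are made up here.

def build_insert(table, n_rows):
    """Return one HiveQL INSERT statement containing n_rows value tuples."""
    rows = ", ".join("({}, 'val_{}')".format(i, i) for i in range(n_rows))
    return "INSERT INTO TABLE {} VALUES {}".format(table, rows)

small = build_insert("t", 10)       # well under the reported threshold
large = build_insert("t", 230000)   # around the reported failure point
print(len(large))                   # statement size in characters
```

Submitting the generated statements at several row counts (e.g. 100,000, 199,000, 230,000) would narrow down where the Thrift "Broken pipe" failure starts.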
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)