[
https://issues.apache.org/jira/browse/DRILL-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493288#comment-14493288
]
Alexander Zarei commented on DRILL-2767:
----------------------------------------
[~jnadeau] I ran into the same problem again. I went to the Hive shell,
executed the queries, and they worked fine. As you said, the problem is related
to Hive, but specifically to the Hive metastore: the metastore stops working
after a while, which causes the Hive storage plugin to stop working. I do not
know why, but I have a guess. When I start the metastore with "hive
--service metastore", the Hive storage plugin starts working and I can see the
schemas and Hive tables in Drill Explorer, but I cannot query the tables.
My guess as to why the metastore stops is that all my SSH sessions
disconnect if I do not run a command for a while. I start the Hive service in
one of two ways: either with "hive --service metastore", which blocks the
command prompt until the connection dies after maybe 1 hour, or with "hive
--service metastore &", which returns the command prompt so I can keep running
commands, but the metastore still stops after a while.
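To test the guess above, a minimal sketch along the following lines might help. It assumes the metastore listens on the default Thrift port 9083 on the local host, and the helper names metastore_reachable and start_detached are hypothetical, not part of Hive or Drill. The first helper only checks whether something accepts TCP connections on that port. The second starts a command in its own session, so a hangup of the SSH terminal should not reach it.

```python
import socket
import subprocess


def metastore_reachable(host="localhost", port=9083, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    9083 is the default Hive metastore Thrift port (assumed here).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def start_detached(cmd, log_path):
    """Start cmd in a new session, appending its output to log_path.

    start_new_session=True runs the child via setsid(), so it has no
    controlling terminal and is not tied to the SSH session that
    launched it.
    """
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
    finally:
        # The child keeps its own copy of the file descriptor.
        log.close()
```

For example, start_detached(["hive", "--service", "metastore"], "metastore.log") would launch the metastore, and calling metastore_reachable() periodically would show whether it is still accepting connections after the SSH session drops. If it still stops, the cause is probably something other than the disconnect.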
Any suggestions?
Thanks
> Fragment error on TPCH Scale Factor 30 on a query that completed successfully
> previously
> ----------------------------------------------------------------------------------------
>
> Key: DRILL-2767
> URL: https://issues.apache.org/jira/browse/DRILL-2767
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Hive
> Affects Versions: 0.8.0
> Environment: AWS EMR cluster of three m1.xlarge nodes
> Reporter: Alexander Zarei
> Assignee: Venki Korukanti
> Attachments: drillbitcore1.log, drillbitcore1.out, drillbitcore2.log,
> drillbitcore2.out, drillbitmaster.out, lineitem table schema .png
>
>
> The following sequence led to the error:
> Executed the query
> bq. SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`
> and it took about 43 minutes to execute successfully.
> Afterward, I ran the query
> bq. SELECT * FROM `realhive`.`tpch_text_2`.`lineitem`
> 6 times to find an optimization value for the ODBC driver.
> Afterward, I submitted the first query again
> bq. SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`
>
> and the Drill Cluster returned a fragment error.
> bq. ***[HY000]: [MapR][Drill] (1040) Drill failed to execute the query:
> SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`[30024]Query execution
> error. Details:[RemoteRpcException: Failure while running fragment.[
> fb97e7be-d09e-46fe-8728-9577fd0d8795 on ip-10-12-62-65
> Log files with debug level for the Drillbits on the master node as well as
> the core nodes of the cluster are attached.
> Also, the connection through the ODBC driver on 32-bit Linux was "Direct" to
> the drillbit on the master node of the Hadoop cluster.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)