[jira] [Commented] (DRILL-2767) Fragment error on TPCH Scale Factor 30 on a query that completed successfully previously

Alexander Zarei (JIRA) Mon, 13 Apr 2015 16:40:42 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493288#comment-14493288
 ]


Alexander Zarei commented on DRILL-2767:
----------------------------------------

[~jnadeau] I ran into the same problem again. I went to the Hive shell and 
executed the queries, and they worked fine. As you said, it is related to 
Hive, but specifically to the Hive metastore: the metastore stops working 
after a while, which causes the Hive storage plugin to stop working. I do not 
actually know why, but I have a guess. When I start the metastore with "hive 
--service metastore", the Hive storage plugin starts working and I can see the 
schemas and Hive tables in Drill Explorer, but I cannot query the tables.

My guess for why the metastore stops is that all my SSH sessions disconnect 
if I do not run a command for a good while. I start the Hive service in one 
of two ways: either with "hive --service metastore", which holds the command 
prompt, and the connection dies after maybe 1 hour; or with "hive --service 
metastore &", which returns the command prompt so I can keep running commands, 
but the metastore still stops after a while.
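
If the metastore is in fact being killed by the hangup signal sent when the 
SSH session drops (this is only a guess and is not confirmed by the logs), 
one thing to try is starting it under nohup with its output redirected to a 
file. The sketch below shows that idea; the helper names start_detached and 
is_running, and the log and PID file paths, are hypothetical and not part of 
Hive or Drill.

```shell
#!/bin/sh
# Sketch only: start a long-running command so that it is not tied to the
# terminal of the SSH session that launched it.
# start_detached and is_running are hypothetical helper names.

# Usage: start_detached LOG_FILE PID_FILE COMMAND [ARGS...]
start_detached() {
    log_file=$1
    pid_file=$2
    shift 2
    # nohup makes the command ignore SIGHUP; redirecting stdin, stdout and
    # stderr removes its remaining ties to the terminal.
    nohup "$@" > "$log_file" 2>&1 < /dev/null &
    # Record the PID of the background job so it can be checked later.
    echo $! > "$pid_file"
}

# Usage: is_running PID_FILE
# Succeeds (exit status 0) if the recorded process still exists.
is_running() {
    [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}
```

For the metastore this would be called as, for example, 
start_detached /tmp/metastore.log /tmp/metastore.pid hive --service metastore 
and checked later with is_running /tmp/metastore.pid. If the metastore still 
stops when started this way, the SSH disconnect is probably not the cause, 
and the log file should show what happened before it exited.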

Any suggestions? 

Thanks

> Fragment error on TPCH Scale Factor 30 on a query that completed successfully 
> previously
> ----------------------------------------------------------------------------------------
>
>                 Key: DRILL-2767
>                 URL: https://issues.apache.org/jira/browse/DRILL-2767
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 0.8.0
>         Environment: AWS EMR cluster of three m1.xlarge nodes
>            Reporter: Alexander Zarei
>            Assignee: Venki Korukanti
>         Attachments: drillbitcore1.log, drillbitcore1.out, drillbitcore2.log, 
> drillbitcore2.out, drillbitmaster.out, lineitem table schema .png
>
>
> The following sequence led to the error:
> Executed the query 
> bq. SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`
> and it took about 43 minutes to execute successfully. 
> Afterward I ran the query 
> bq. SELECT * FROM `realhive`.`tpch_text_2`.`lineitem`
> 6 times to find an optimization value for the ODBC driver. 
> Afterward, I submitted the first query again
> bq. SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`
>  
> and the Drill Cluster returned a fragment error.
> bq. ***[HY000]: [MapR][Drill] (1040) Drill failed to execute the query: 
> SELECT * FROM `realhive`.`tpch_text_30`.`lineitem`[30024]Query execution 
> error. Details:[RemoteRpcException: Failure while running fragment.[ 
> fb97e7be-d09e-46fe-8728-9577fd0d8795 on ip-10-12-62-65
> Log files with debug level for the Drillbits on the master node as well as 
> the core nodes of the cluster are attached.
> Also, the connection through the ODBC driver on 32-bit Linux was "Direct" to 
> the drillbit on the master node of the Hadoop cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
