Re: VolcanoPlanner vs HepPlanner

2015-10-07 Thread Raajay Viswanathan
. Thanks Raajay > On Oct 7, 2015, at 11:58 AM, John Pullokkaran <jpullokka...@hortonworks.com> > wrote: > > This would be a broad change. > Hep Planner does enumerate different join orders through > "LoptOptimizeJoinRule". > > Volcano planner is not used as it

VolcanoPlanner vs HepPlanner

2015-10-06 Thread Raajay
es need to be passed to the HiveVolcanoPlanner for effective CBO ? Thanks, Raajay

Hive Compile mode

2015-09-09 Thread Raajay
Is it possible to use Hive only in compile mode ( and not execute the queries) ? The output here would be a DAG say to be executed on TEZ later. Thanks Raajay

Re: Hive Compile mode

2015-09-09 Thread Raajay
Ah okay thanks! On Wed, Sep 9, 2015 at 10:44 PM, Jeff Zhang <zjf...@gmail.com> wrote: > Use explain > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain > > > > On Thu, Sep 10, 2015 at 11:07 AM, Raajay <raaja...@gmail.com> wrote: >

Error upon XML serialization.

2015-09-03 Thread Raajay
However, I was not successful in deserializing the object after writing the serialized form to disk. So, 1. Does it make sense to serialize QueryPlan ? 2. If yes, what are the correct configurations ? 3. If not, which is the ideal data structure to serialize after the query compilation stage ?

Hive - Serializing Query Plans

2015-09-02 Thread Raajay
ery object. For execution, I need the QueryPlan.java object. How to go from api.Query (Thrift Generated) to QueryPlan.java ? Thanks Raajay

Re: Serializing dags

2015-09-01 Thread Raajay
I see from the docs that QueryPlan can be serialized to a string using the toThriftJSONString() function. How do I deserialize it ? Any pointers would be helpful. Thanks, Raajay On Tue, Sep 1, 2015 at 11:26 AM, Raajay <raaja...@gmail.com> wrote: > Hi Canan, > > The changes that

Re: Serializing dags

2015-09-01 Thread Raajay
Hi Canan, The changes that I am primarily interested in are: a. Altering the parallelism of the DAG b. Changing task location hints, etc. In general, I want to make these alterations and run the DAGs on Tez, without having to go through the Hive pipeline. Raajay On Mon, Aug 31, 2015 at 11:42 PM

Serializing dags

2015-08-31 Thread Raajay
Hello, Currently, I am running Hive on Tez. I wish to make some changes to the DAGs generated by Hive before running on Tez/Yarn. Which data structure should I serialize ? DAG or DagPlan ? - Raajay

Re: Run multiple queries simultaneously

2015-08-25 Thread Raajay
Noam, I am concerned with cases where the network is a bottleneck. Will I be able to control it in YARN ? Ideally, I would like to run multiple queries simultaneously. Raajay On Tue, Aug 25, 2015 at 9:31 AM, Noam Hasson noam.has...@kenshoo.com wrote: I would just limit the resources given

Run multiple queries simultaneously

2015-08-25 Thread Raajay
Hello, I want to compare the running time of a query when run alone against its running time in the presence of other queries. What is the ideal setup required to run this experiment ? Should I have two Hive CLIs open and issue queries simultaneously ? How can I script such an experiment in Hive ? Raajay
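One way to script the experiment asked about above, offered only as a rough sketch and not as an answer from the thread: launch each query as a separate Hive CLI process and time it, first alone and then alongside a background query. It assumes the `hive` CLI is on the PATH and that each query lives in its own file, run with `hive -f`. The helper names (`time_command`, `time_concurrently`, `hive_cmd`, `compare`) and the file names are hypothetical.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def time_command(cmd):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def time_concurrently(cmds):
    """Start all commands at roughly the same time.

    Returns the per-command wall-clock times in the same order as cmds.
    """
    with ThreadPoolExecutor(max_workers=len(cmds)) as pool:
        return list(pool.map(time_command, cmds))


def hive_cmd(query_file):
    """Build a Hive CLI invocation that runs the statements in query_file.

    Assumption: the `hive` executable is available on the PATH.
    """
    return ["hive", "-f", query_file]


def compare(target_file, background_file):
    """Time the target query alone, then again with one background query running."""
    target = hive_cmd(target_file)
    background = hive_cmd(background_file)
    alone = time_command(target)
    contended = time_concurrently([target, background])[0]
    return alone, contended
```

Calling `compare("target_query.sql", "background_query.sql")` (placeholder file names) returns the two timings for the target query. Each measurement includes Hive CLI startup time, and the two processes start only approximately together, so repeated runs would be needed for a fair comparison.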

CBO - get cost of the plan

2015-08-24 Thread Raajay
cost = {0}, id = 47 The number of rows as displayed here is 1.0, which is clearly not the correct value. - Raajay.

Re: CBO - get cost of the plan

2015-08-24 Thread Raajay
commands, I find that CBO optimization is ignored as expected. Perhaps I am missing some configuration. I print out the calcite optimized plans, using the RelOptUtil.toString() helper on calciteOptimizedPlan at the end of apply function in CalcitePlannerAction. - Raajay Query = -- Set

Re: CBO - get cost of the plan

2015-08-24 Thread Raajay
are |tableA| = |tableB| = 7E6 and |tableA join tableB| = 1.76E7 (values highlighted in red in the log snippet above) Should I be able to explicitly specify it somewhere, so that stats gathering is accurate ? Also, what is the definition of cumulative cost ? Thanks for the help, Raajay On Mon, Aug 24, 2015 at 8

Hive CBO - Calcite Interface

2015-08-10 Thread Raajay
Operator trees) with cost less than a threshold. 2. Is there an interface for Hive to get the absolute cost (based on Hive Cost Factory) of an operator tree returned by Calcite ? Thanks, Raajay

Running hive on tez locally

2015-08-07 Thread Raajay
I have been running Hive queries on a single node (no HDFS). I realize that the queries get compiled as map-reduce jobs and not as TEZ jobs even though hive.execution.engine=tez is set. Is that expected ? If yes, what is the ideal environment for debugging hive on tez? Raajay

View debug logs

2015-07-30 Thread Raajay
Hello everyone, How do I view the logs generated by the log4j logger while running the query tests from itest/qtest ? Also, how do I set the log4j properties, since I need to view the most detailed logs ? Thanks, Raajay

Hive debug in eclipse

2015-07-30 Thread Raajay
(involving end to end processing of queries) rather than unit tests for a single module/class. 2. Are there other alternatives for speeding up the editing-testing cycle ? Thanks Raajay

Semantic Analysis Run Through

2015-07-30 Thread Raajay
/ explanations will be helpful. Thanks, Raajay

Re: Cost based optimization

2015-06-26 Thread Raajay
Awesome! Thanks John. I would be grateful if you could point me to the files in the source code that are primarily responsible for Query Planning. Thanks, Raajay On Thu, Jun 25, 2015 at 4:45 PM, John Pullokkaran jpullokka...@hortonworks.com wrote: Hive does look into alternate join

Cost based optimization

2015-06-25 Thread Raajay
Hello Everyone, A quick question on the cost-based optimization module in Hive. Does the latest version support query plan generation with alternate join orders ? Thanks Raajay