Re: Questions related to HBase general use

2015-05-14 Thread kulkarni.swar...@gmail.com
+ hive-dev

Thanks for your question. We have recently been busy adding quite a few
features on top of the Hive/HBase integration to make it more stable and
easier to use. We also gave a talk very recently at HBaseCon 2015 showing off
the latest improvements. Slides are here[1]. As Jerry mentioned, if you run a
regular query from Hive on an HBase table with billions of rows, it is
going to be slow, as it would trigger a full table scan. However, Hive has
smarts around filter pushdown, where the attributes in a where clause are
pushed down and converted to scan ranges and filters to optimize the scan.
Plus, with the recent Hive on Spark uplift, I see this integration
benefiting from that as well.
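
To make the pushdown idea concrete, here is a minimal sketch of how
conjunctive predicates on a row key could be folded into a single scan range.
It is an illustration only, not Hive's actual implementation: the class and
method names (KeyRangeSketch, Predicate, Range, toRange) are hypothetical, only
">=" and "<" are handled, and keys are compared as Java strings rather than as
the raw bytes HBase uses.

```java
import java.util.Arrays;
import java.util.List;

public class KeyRangeSketch {
    /** A comparison on the row key, e.g. key >= "user100". Hypothetical type. */
    static final class Predicate {
        final String op;      // ">=" or "<" in this sketch
        final String value;
        Predicate(String op, String value) { this.op = op; this.value = value; }
    }

    /** Half-open key range [start, stop); null means unbounded on that side. */
    static final class Range {
        String start;
        String stop;
        @Override public String toString() { return "[" + start + ", " + stop + ")"; }
    }

    /** Intersects all predicates (implicitly ANDed) into one range. */
    static Range toRange(List<Predicate> conjuncts) {
        Range r = new Range();
        for (Predicate p : conjuncts) {
            switch (p.op) {
                case ">=":
                    // Keep the largest lower bound seen so far.
                    if (r.start == null || p.value.compareTo(r.start) > 0) r.start = p.value;
                    break;
                case "<":
                    // Keep the smallest upper bound seen so far.
                    if (r.stop == null || p.value.compareTo(r.stop) < 0) r.stop = p.value;
                    break;
                default:
                    // Anything else would stay a residual filter in a real system.
                    throw new IllegalArgumentException("unsupported operator: " + p.op);
            }
        }
        return r;
    }

    public static void main(String[] args) {
        Range r = toRange(Arrays.asList(
                new Predicate(">=", "user100"),
                new Predicate("<", "user200")));
        System.out.println(r);
    }
}
```

With a range like this, the scan only has to touch rows between the two keys
instead of the whole table, which is where the savings come from.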

That said, we use this integration here daily over billions of rows to run
hundreds of queries without any issues. Since you mentioned that you are
already a big consumer of Hive, I would highly recommend giving this a
spin and reporting back with whatever issues you face so we can work on making
this more stable.

Hope that helps.

Swarnim

[1]
https://docs.google.com/presentation/d/1K2A2NMsNbmKWuG02aUDxsLo0Lal0lhznYy8SB6HjC9U/edit#slide=id.p

On Wed, May 13, 2015 at 6:26 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 + Swarnim, who's expert on HBase/Hive integration.

 Yes, snapshots may be interesting for you. I believe Hive can access HBase
 timestamps, exposed as a virtual column. It's assumed across the whole
 row, however, not per cell.

 On Sun, May 10, 2015 at 9:14 PM, Jerry He jerry...@gmail.com wrote:

 Hi, Yong

 You have a good understanding of the benefit of HBase already.
 Generally speaking, HBase is suitable for real time read/write to your big
 data set.
 Regarding the HBase performance evaluation tool, the 'read' test uses HBase
 'get'. For 1m rows, the test would issue 1m 'get' calls (and RPCs) to the
 server. The 'scan' test scans the table and transfers the rows to the client
 in batches (e.g. 100 rows at a time), so the whole test takes less time
 to complete for the same number of rows.
 The hive/hbase integration, as you said, needs more consideration.
 1) The performance.  Hive accesses HBase via the HBase client API, which
 involves going to the HBase server for all data access. This will slow
 things down.
 There are a couple of things you can explore, e.g. Hive/HBase snapshot
 integration. This would provide direct access to HBase hfiles.
 2) In your email, you are interested in HBase's capability of storing
 multiple versions of data.  You need to consider whether Hive supports this
 HBase feature, i.e. gives you access to multiple versions. As far as I can
 remember, it does not fully.

 Jerry


 On Thu, May 7, 2015 at 6:18 PM, java8964 java8...@hotmail.com wrote:

  Hi,
  I am kind of new to HBase. Currently our production runs IBM BigInsights V3,
  which comes with Hadoop 2.2 and HBase 0.96.0.
  We are mostly using HDFS and Hive/Pig for our big data project, and it works
  very well for our big datasets. Right now, we have one dataset that needs to
  be loaded from MySQL, about 100G, and it will have about Gs of changes daily.
  This is very important slowly changing dimension data, and we would like to
  keep it in sync between MySQL and the big data platform.
  I am thinking of using HBase to store it, instead of refreshing the whole
  dataset in HDFS, because:
  1) HBase makes merging the changes very easy.
  2) HBase can store all the changes in the history, as a function out of the
  box. We will replicate all the changes from the MySQL binlog level, and we
  could keep all changes in HBase (or a long history), which can give us some
  insight that cannot be obtained easily in HDFS.
  3) HBase gives us fast access to the data by key, for some cases.
  4) HBase is available out of the box.
  What I am not sure about is the Hive/HBase integration. Hive is the top tool
  in our environment. If one dataset is stored in HBase (even only about 100G
  as of now), the join between it and the other big datasets in HDFS worries
  me. I have read quite a lot about Hive/HBase integration, and feel that it
  is not really mature, as I cannot find many use cases online,
  especially on performance. There are quite a few JIRAs about making Hive
  utilize HBase for performance in MR jobs that are still pending.
  I want to know other people's experience of using HBase in this way. I
  understand HBase is not designed as a storage system for a data warehouse
  component or analytics engine. But the benefits of using HBase in this case
  still attract me. If my use of HBase is mostly reads or full scans of
  the data, how bad is it compared to HDFS in the same cluster? 3x? 5x?
  To help me understand the read throughput of HBase, I used the HBase
  performance evaluation tool, but the output is quite confusing. I have 2
  clusters: one has 5 nodes with 3 slaves, all running on VMs (each with
  24G + 4 cores, so the cluster has 12 mappers + 6 reducers); the other is a
  real cluster with 5 nodes with 3 slaves with 64G + 24 cores and with (48
  mapper
  

Re: Review Request 34197: HIVE-10706 Make vectorized_timestamp_funcs test more stable

2015-05-14 Thread Swarnim Kulkarni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34197/#review83782
---

Ship it!


Ship It!

- Swarnim Kulkarni


On May 14, 2015, 6:28 a.m., Alexander Pivovarov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34197/
 ---
 
 (Updated May 14, 2015, 6:28 a.m.)
 
 
 Review request for hive and Jason Dere.
 
 
 Bugs: HIVE-10706
 https://issues.apache.org/jira/browse/HIVE-10706
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-10706 Make vectorized_timestamp_funcs test more stable
 
 
 Diffs
 -
 
   ql/src/test/queries/clientpositive/vectorized_timestamp_funcs.q 8a2d5aaf5fb0396e551bdefdde507d1e9902919b 
   ql/src/test/results/clientpositive/spark/vectorized_timestamp_funcs.q.out 304458215b4dcbc4d49321ba5f14ca5a87f2ec26 
   ql/src/test/results/clientpositive/tez/vectorized_timestamp_funcs.q.out fa3ed21232004d710b33cadac66680eabaca2c8a 
   ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out 31a96c68b22bd5332fb71b52982de71710df65fa 
 
 Diff: https://reviews.apache.org/r/34197/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Alexander Pivovarov
 




[jira] [Created] (HIVE-10709) Update Avro version to 1.7.7

2015-05-14 Thread Swarnim Kulkarni (JIRA)
Swarnim Kulkarni created HIVE-10709:
---

 Summary: Update Avro version to 1.7.7
 Key: HIVE-10709
 URL: https://issues.apache.org/jira/browse/HIVE-10709
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni


We should update the Avro version to 1.7.7 to consume some of the nicer 
compatibility features.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34143: Fix stats annotation

2015-05-14 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34143/
---

(Updated May 14, 2015, 4:50 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Changes
---

update the failing q tests.


Repository: hive-git


Description
---

This is an umbrella patch for a bunch of issues:
HIVE-8769 Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)
HIVE-9392 JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName
HIVE-10107 Union All : Vertex missing stats resulting in OOM and in-efficient plans


Diffs (updated)
-

  hbase-handler/src/test/results/positive/external_table_ppd.q.out 6d48edb 
  hbase-handler/src/test/results/positive/hbase_custom_key2.q.out c9b5a84 
  hbase-handler/src/test/results/positive/hbase_custom_key3.q.out 76848e0 
  hbase-handler/src/test/results/positive/hbase_ppd_key_range.q.out 6174bfb 
  hbase-handler/src/test/results/positive/hbase_pushdown.q.out 8a979bf 
  hbase-handler/src/test/results/positive/hbase_queries.q.out 7863f69 
  hbase-handler/src/test/results/positive/hbase_timestamp.q.out 3aae7d0 
  hbase-handler/src/test/results/positive/ppd_key_ranges.q.out 5936735 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 0de7488 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 44269f0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AbstractOperatorDesc.java 0a83440 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java c420190 
  ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java f66279f 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 508d880 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out e8cd06d 
  ql/src/test/results/clientpositive/annotate_stats_limit.q.out 5f8b6f8 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out 241192b 
  ql/src/test/results/clientpositive/annotate_stats_select.q.out 753ab4e 
  ql/src/test/results/clientpositive/annotate_stats_table.q.out 9bf82ac 
  ql/src/test/results/clientpositive/auto_join30.q.out b068493 
  ql/src/test/results/clientpositive/auto_join31.q.out 1e19dd0 
  ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
  ql/src/test/results/clientpositive/auto_join_stats.q.out 9100762 
  ql/src/test/results/clientpositive/auto_join_stats2.q.out ed09875 
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out ce4ad8a 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
  ql/src/test/results/clientpositive/auto_sortmerge_join_14.q.out 43504d8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out afd5518 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
  ql/src/test/results/clientpositive/auto_sortmerge_join_6.q.out f039dda 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out 65aa3ef 
  ql/src/test/results/clientpositive/binarysortable_1.q.out c4ba7e0 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_2.q.out eec099c 
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_4.q.out 1a644a9 
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_5.q.out e4f90e4 
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_6.q.out 307c83b 
  ql/src/test/results/clientpositive/column_access_stats.q.out a779564 
  ql/src/test/results/clientpositive/complex_alias.q.out 133ce91 
  ql/src/test/results/clientpositive/correlationoptimizer1.q.out 0eb1596 
  ql/src/test/results/clientpositive/correlationoptimizer10.q.out 3c3564d 
  ql/src/test/results/clientpositive/correlationoptimizer11.q.out bd86942 
  ql/src/test/results/clientpositive/correlationoptimizer15.q.out b57203e 
  ql/src/test/results/clientpositive/correlationoptimizer2.q.out 43d209f 
  ql/src/test/results/clientpositive/correlationoptimizer3.q.out 5389647 
  ql/src/test/results/clientpositive/correlationoptimizer4.q.out b350816 
  ql/src/test/results/clientpositive/correlationoptimizer5.q.out 6ba3462 
  ql/src/test/results/clientpositive/correlationoptimizer6.q.out be518dc 
  

[jira] [Created] (HIVE-10710) Delete GenericUDF.getConstantLongValue

2015-05-14 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-10710:
--

 Summary: Delete GenericUDF.getConstantLongValue
 Key: HIVE-10710
 URL: https://issues.apache.org/jira/browse/HIVE-10710
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Trivial


GenericUDF.getConstantLongValue has a bug.
Instead of fixing the bug it was suggested to delete the method because it is 
not used in hive.





Re: [DISCUSS] Supporting Hadoop-1 and experimental features

2015-05-14 Thread Gopal Vijayaraghavan
Hi,

+1 on the idea.

Having a stable release branch with ongoing fixes where we do not drop
major features would be good all around.

It lets us accelerate the pace of development, drop major features or
rewrite them entirely without dragging everyone else kicking & screaming
into that release.

Cheers,
Gopal

On 5/11/15, 7:17 PM, Sergey Shelukhin ser...@hortonworks.com wrote:

That sounds like a good idea.
Some features could be back ported to branch-1 if viable, but at least new
stuff would not be burdened by Hadoop 1/MR code paths.
Probably also a good place to enable vectorization and other perf features
by default while we make alpha releases.

+1

On 15/5/11, 15:38, Alan Gates alanfga...@gmail.com wrote:

There is a lot of forward-looking work going on in various branches of
Hive:  LLAP, the HBase metastore, and the work to drop the CLI.  It
would be good to have a way to release this code to users so that they
can experiment with it.  Releasing it will also provide feedback to
developers.

At the same time there are discussions on whether to keep supporting
Hadoop-1.  The burden of supporting older, less used functionality such
as Hadoop-1 is becoming ever harder as many new features are added.

I propose that the best way to deal with this would be to make a
branch-1.  We could continue to make new feature releases off of this
branch (1.3, 1.4, etc.).  This branch would not drop old functionality.
This provides stability and continuity for users and developers.

We could then merge these new features branches (LLAP, HBase metastore,
CLI drop) into the trunk, as well as turn on by default newer features
such as the vectorization and ACID.  We could also drop older, less used
features such as support for Hadoop-1 and MapReduce.  It will be a while
before we are ready to make stable, production ready releases of this
code.  But we could start making alpha quality releases soon.  We would
call these releases 2.x, to stress the non-backward compatible changes
such as dropping Hadoop-1.  This will give users a chance to play with
the new code and developers a chance to get feedback.

Thoughts?





Re: JIRA notifications

2015-05-14 Thread Prasanth Jayachandran
@Swarnim.. 
Generating a patch with git diff needs to include the full index for it to be 
uploaded to Review Board: “git diff --full-index”.
https://code.google.com/p/reviewboard/issues/detail?id=3115

- Prasanth

 On May 14, 2015, at 9:14 AM, Thejas Nair thejas.n...@gmail.com wrote:
 
 Now that we have moved to git, you can try using github pull request instead.
 It also  integrates with jira.
 More git instructions - http://accumulo.apache.org/git.html
 
 
 On Thu, May 14, 2015 at 8:01 AM, kulkarni.swar...@gmail.com
 kulkarni.swar...@gmail.com wrote:
 Also not sure if it's related, but it seems like RB has been pretty sluggish
 lately too for me. It takes forever for a patch to be submitted and a review
 request to be created (the latest one has been running for the past 30
 minutes with no output).
 
 On Wed, May 13, 2015 at 4:26 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:
 
 By the way, we still need to add iss...@hive.apache.org to the
 website's Mailing
 Lists http://hive.apache.org/mailing_lists.html page -- see HIVE-10124
 https://issues.apache.org/jira/browse/HIVE-10124.
 
 -- Lefty
 
 On Wed, May 13, 2015 at 2:16 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:
 
 But some notifications and comments aren't making it onto any Hive
 mailing
 list -- see INFRA-9221 https://issues.apache.org/jira/browse/INFRA-9221
 (please
 add your own comments and examples).  This means the mail archives don't
 have a complete record of JIRA activity.
 
 -- Lefty
 
 On Wed, May 13, 2015 at 10:03 AM, Thejas Nair thejas.n...@gmail.com
 wrote:
 
 comments now added go to iss...@hive.apache.org .
 emails for JIRAs created should still go to dev@
 
 
 On Wed, May 13, 2015 at 9:25 AM, kulkarni.swar...@gmail.com
 kulkarni.swar...@gmail.com wrote:
 I noticed that I haven't been getting notifications(or they are really
 delayed) on any of the new JIRAs created/ comments added. Anyone else
 noticing similar issues as well?
 
 --
 Swarnim
 
 
 
 
 
 
 
 --
 Swarnim



Re: [DISCUSS] Hive API passivity

2015-05-14 Thread Thejas Nair
By passivity do you mean backward compatibility?
Not all APIs have the same level of maturity, and the audience for them
can also be different.

Public APIs are supposed to be marked with the annotations under
org.apache.hadoop.hive.common.classification.InterfaceAudience as
Public, and the expectations regarding backward compatibility set
using InterfaceStability annotations.

For example, the UDF APIs should be marked as @Public and @Stable.
However, APIs for new functionality might be marked @Unstable or
@Evolving.
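
As a rough sketch of what such marking looks like and how tooling could check
it, the example below defines local stand-in annotations instead of importing
the real ones from org.apache.hadoop.hive.common.classification, so that it
compiles on its own. The class names (ApiStabilitySketch, MyUdfApi,
MyNewFeatureApi) and the helper isStablePublicApi are hypothetical.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class ApiStabilitySketch {
    // Stand-ins for the audience/stability markers; the real annotations
    // live in Hive's classification package and are not used here.
    @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @interface Stable {}
    @Retention(RetentionPolicy.RUNTIME) @interface Evolving {}

    /** A user-facing API that promises backward compatibility. */
    @Public @Stable
    static class MyUdfApi {}

    /** A user-facing API for new functionality that may still change. */
    @Public @Evolving
    static class MyNewFeatureApi {}

    /** True only for classes that are both public-facing and stable. */
    static boolean isStablePublicApi(Class<?> c) {
        return c.isAnnotationPresent(Public.class)
                && c.isAnnotationPresent(Stable.class);
    }

    public static void main(String[] args) {
        System.out.println(isStablePublicApi(MyUdfApi.class));
        System.out.println(isStablePublicApi(MyNewFeatureApi.class));
    }
}
```

A check like this could in principle be run over a patch to flag incompatible
changes to anything marked stable, which is one way to turn the annotations
into an enforceable guarantee.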



On Thu, May 14, 2015 at 9:19 AM, kulkarni.swar...@gmail.com
kulkarni.swar...@gmail.com wrote:
 While reviewing some of the recent patches, I came across a few with
 non-passive changes and/or discussions around them. I was wondering what
 kind of passivity guarantees we should provide to our consumers. I
 understand that the Hive API is probably not as widely used as some of its
 peers in the ecosystem, like HBase. But is that something we should
 start thinking about, especially around user-facing interfaces like UDFs,
 SerDes, StorageHandlers etc.? More so given that we are 1.0 now?
 IMO we should avoid making any such changes, or if we have to, do so
 with a major version bump for the next release.

 Thoughts?

 --
 Swarnim


Review Request 34223: HIVE-10710 Delete GenericUDF.getConstantLongValue

2015-05-14 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34223/
---

Review request for hive, Ashutosh Chauhan and Jason Dere.


Bugs: HIVE-10710
https://issues.apache.org/jira/browse/HIVE-10710


Repository: hive-git


Description
---

HIVE-10710 Delete GenericUDF.getConstantLongValue


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java b043bdc882af7c0b83787526a5a55c9dc29c6681 

Diff: https://reviews.apache.org/r/34223/diff/


Testing
---


Thanks,

Alexander Pivovarov



Re: JIRA notifications

2015-05-14 Thread Alexander Pivovarov
You can use the following command to create a new review. It takes about 3-5
sec:
$ rbt post -g yes

To update the review, you can run:
$ rbt post -u -g yes

On Thu, May 14, 2015 at 10:48 AM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 @Swarnim..
 Generating a patch with git diff needs to include the full index for it to
 be uploaded to Review Board: “git diff --full-index”.
 https://code.google.com/p/reviewboard/issues/detail?id=3115

 - Prasanth

  On May 14, 2015, at 9:14 AM, Thejas Nair thejas.n...@gmail.com wrote:
 
  Now that we have moved to git, you can try using github pull request
 instead.
  It also  integrates with jira.
  More git instructions - http://accumulo.apache.org/git.html
 
 
  On Thu, May 14, 2015 at 8:01 AM, kulkarni.swar...@gmail.com
  kulkarni.swar...@gmail.com wrote:
  Also not sure if it's related, but it seems like RB has been pretty sluggish
  lately too for me. It takes forever for a patch to be submitted and a review
  request to be created (the latest one has been running for the past 30
  minutes with no output).
 
  On Wed, May 13, 2015 at 4:26 PM, Lefty Leverenz 
 leftylever...@gmail.com
  wrote:
 
  By the way, we still need to add iss...@hive.apache.org to the
  website's Mailing
  Lists http://hive.apache.org/mailing_lists.html page -- see
 HIVE-10124
  https://issues.apache.org/jira/browse/HIVE-10124.
 
  -- Lefty
 
  On Wed, May 13, 2015 at 2:16 PM, Lefty Leverenz 
 leftylever...@gmail.com
  wrote:
 
  But some notifications and comments aren't making it onto any Hive
  mailing
  list -- see INFRA-9221 
 https://issues.apache.org/jira/browse/INFRA-9221
  (please
  add your own comments and examples).  This means the mail archives
 don't
  have a complete record of JIRA activity.
 
  -- Lefty
 
  On Wed, May 13, 2015 at 10:03 AM, Thejas Nair thejas.n...@gmail.com
  wrote:
 
  comments now added go to iss...@hive.apache.org .
  emails for JIRAs created should still go to dev@
 
 
  On Wed, May 13, 2015 at 9:25 AM, kulkarni.swar...@gmail.com
  kulkarni.swar...@gmail.com wrote:
  I noticed that I haven't been getting notifications(or they are
 really
  delayed) on any of the new JIRAs created/ comments added. Anyone
 else
  noticing similar issues as well?
 
  --
  Swarnim
 
 
 
 
 
 
 
  --
  Swarnim




[jira] [Created] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-14 Thread Jason Dere (JIRA)
Jason Dere created HIVE-10711:
-

 Summary: Tez HashTableLoader attempts to allocate more memory than 
available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere


Tez HashTableLoader bases its memory allocation on 
HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
process max memory, then this can result in the HashTableLoader trying to use 
more memory than is available to the process.
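
One possible direction, shown only as a sketch and not as the actual fix for
this issue, is to cap the configured threshold at some fraction of the memory
the JVM reports as available. The class name, the method effectiveThreshold,
and the 0.5 fraction are assumptions made for illustration;
Runtime.maxMemory() is the standard JDK call.

```java
public class HashTableMemorySketch {
    /**
     * Returns the smaller of the configured threshold and a fraction of the
     * process max memory, so the loader never plans for more than it can get.
     */
    static long effectiveThreshold(long configuredThreshold,
                                   long processMaxMemory,
                                   double maxFraction) {
        long cap = (long) (processMaxMemory * maxFraction);
        return Math.min(configuredThreshold, cap);
    }

    public static void main(String[] args) {
        long processMax = Runtime.getRuntime().maxMemory();
        // A deliberately oversized configured value (about 10 GB) gets capped.
        long threshold = effectiveThreshold(10_000_000_000L, processMax, 0.5);
        System.out.println("max=" + processMax + " effective=" + threshold);
    }
}
```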





Re: JIRA notifications

2015-05-14 Thread kulkarni.swar...@gmail.com
Also not sure if it's related, but it seems like RB has been pretty sluggish
lately too for me. It takes forever for a patch to be submitted and a review
request to be created (the latest one has been running for the past 30
minutes with no output).

On Wed, May 13, 2015 at 4:26 PM, Lefty Leverenz leftylever...@gmail.com
wrote:

 By the way, we still need to add iss...@hive.apache.org to the
 website's Mailing
 Lists http://hive.apache.org/mailing_lists.html page -- see HIVE-10124
 https://issues.apache.org/jira/browse/HIVE-10124.

 -- Lefty

 On Wed, May 13, 2015 at 2:16 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:

  But some notifications and comments aren't making it onto any Hive
 mailing
  list -- see INFRA-9221 https://issues.apache.org/jira/browse/INFRA-9221
 (please
  add your own comments and examples).  This means the mail archives don't
  have a complete record of JIRA activity.
 
  -- Lefty
 
  On Wed, May 13, 2015 at 10:03 AM, Thejas Nair thejas.n...@gmail.com
  wrote:
 
  comments now added go to iss...@hive.apache.org .
  emails for JIRAs created should still go to dev@
 
 
  On Wed, May 13, 2015 at 9:25 AM, kulkarni.swar...@gmail.com
  kulkarni.swar...@gmail.com wrote:
   I noticed that I haven't been getting notifications(or they are really
   delayed) on any of the new JIRAs created/ comments added. Anyone else
   noticing similar issues as well?
  
   --
   Swarnim
 
 
 




-- 
Swarnim


Re: Review Request 33881: HIVE-10623 Implement hive cli options using beeline functionality

2015-05-14 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33881/
---

(Updated May 14, 2015, 3:28 p.m.)


Review request for hive and Xuefu Zhang.


Bugs: HIVE-10623
https://issues.apache.org/jira/browse/HIVE-10623


Repository: hive-git


Description
---

Changes:
1. Support the hive cli options including database, e, !, H, f.
2. Add error handler for using f and e together
3. Add error handler for invalid option


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 0da15f6 
  beeline/src/java/org/apache/hive/beeline/cli/CliOptionsProcessor.java PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/cli/HiveCli.java PRE-CREATION 
  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java PRE-CREATION 
  beeline/src/test/resources/hive-site.xml PRE-CREATION 

Diff: https://reviews.apache.org/r/33881/diff/


Testing
---

Newly added unit tests passed locally.


Thanks,

cheng xu



[DISCUSS] Hive API passivity

2015-05-14 Thread kulkarni.swar...@gmail.com
While reviewing some of the recent patches, I came across a few with
non-passive changes and/or discussions around them. I was wondering what
kind of passivity guarantees we should provide to our consumers. I
understand that the Hive API is probably not as widely used as some of its
peers in the ecosystem, like HBase. But is that something we should
start thinking about, especially around user-facing interfaces like UDFs,
SerDes, StorageHandlers etc.? More so given that we are 1.0 now?
IMO we should avoid making any such changes, or if we have to, do so
with a major version bump for the next release.

Thoughts?

-- 
Swarnim


[jira] [Created] (HIVE-10708) Add SchemaCompatibility check to AvroDeserializer

2015-05-14 Thread Swarnim Kulkarni (JIRA)
Swarnim Kulkarni created HIVE-10708:
---

 Summary: Add SchemaCompatibility check to AvroDeserializer
 Key: HIVE-10708
 URL: https://issues.apache.org/jira/browse/HIVE-10708
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni


Avro provides a nice API[1] to check if the given reader schema can be used to 
deserialize the data given its writer schema. I think it would be super nice to 
integrate this into the AvroDeserializer so that we can fail fast and 
gracefully if the schemas are incompatible.

[1] 
https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/SchemaCompatibility.html





Re: JIRA notifications

2015-05-14 Thread Thejas Nair
Now that we have moved to git, you can try using github pull request instead.
It also  integrates with jira.
More git instructions - http://accumulo.apache.org/git.html


On Thu, May 14, 2015 at 8:01 AM, kulkarni.swar...@gmail.com
kulkarni.swar...@gmail.com wrote:
 Also not sure if it's related, but it seems like RB has been pretty sluggish
 lately too for me. It takes forever for a patch to be submitted and a review
 request to be created (the latest one has been running for the past 30
 minutes with no output).

 On Wed, May 13, 2015 at 4:26 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:

 By the way, we still need to add iss...@hive.apache.org to the
 website's Mailing
 Lists http://hive.apache.org/mailing_lists.html page -- see HIVE-10124
 https://issues.apache.org/jira/browse/HIVE-10124.

 -- Lefty

 On Wed, May 13, 2015 at 2:16 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:

  But some notifications and comments aren't making it onto any Hive
 mailing
  list -- see INFRA-9221 https://issues.apache.org/jira/browse/INFRA-9221
 (please
  add your own comments and examples).  This means the mail archives don't
  have a complete record of JIRA activity.
 
  -- Lefty
 
  On Wed, May 13, 2015 at 10:03 AM, Thejas Nair thejas.n...@gmail.com
  wrote:
 
  comments now added go to iss...@hive.apache.org .
  emails for JIRAs created should still go to dev@
 
 
  On Wed, May 13, 2015 at 9:25 AM, kulkarni.swar...@gmail.com
  kulkarni.swar...@gmail.com wrote:
   I noticed that I haven't been getting notifications(or they are really
   delayed) on any of the new JIRAs created/ comments added. Anyone else
   noticing similar issues as well?
  
   --
   Swarnim
 
 
 




 --
 Swarnim


[jira] [Created] (HIVE-10712) Hive on Apache Flink

2015-05-14 Thread Greg Senia (JIRA)
Greg Senia created HIVE-10712:
-

 Summary: Hive on Apache Flink
 Key: HIVE-10712
 URL: https://issues.apache.org/jira/browse/HIVE-10712
 Project: Hive
  Issue Type: Wish
Reporter: Greg Senia


Flink, as an open-source data analytics cluster computing framework, has gained 
some momentum recently. This initiative will provide users a new alternative so 
that those users can consolidate their backend.
Secondly, providing such an alternative further increases Hive's adoption, as it 
exposes Flink users to a viable, feature-rich, de facto standard SQL tool on 
Hadoop.
Finally, allowing Hive to run on Flink also has performance benefits. Hive 
queries, especially those involving multiple reducer stages, will run faster, 
thus improving user experience as Tez/Spark does.
This is an umbrella JIRA which will cover many upcoming subtasks.  Feedback from 
the community is greatly appreciated!





[jira] [Created] (HIVE-10713) Update to HBase 1.0

2015-05-14 Thread Swarnim Kulkarni (JIRA)
Swarnim Kulkarni created HIVE-10713:
---

 Summary: Update to HBase 1.0
 Key: HIVE-10713
 URL: https://issues.apache.org/jira/browse/HIVE-10713
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni


HBase is now 1.0. We should look into upgrading the HBase deps in Hive to 1.0 
as well.





[jira] [Created] (HIVE-10714) Bloom filter column names specification should be case insensitive

2015-05-14 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-10714:


 Summary: Bloom filter column names specification should be case 
insensitive
 Key: HIVE-10714
 URL: https://issues.apache.org/jira/browse/HIVE-10714
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Column names specified for orc bloom filter creation should be case insensitive.
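
A minimal sketch of the intended behavior, assuming the column list arrives as
a comma-separated string: normalize every configured name to lower case once,
then compare schema column names in lower case as well. The class and method
names are hypothetical and this is not the ORC writer's actual code.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class BloomFilterColumnsSketch {
    /** Parses a comma-separated column list into a lower-cased set. */
    static Set<String> normalize(String commaSeparated) {
        Set<String> out = new HashSet<>();
        for (String name : commaSeparated.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                // Locale.ROOT avoids locale-specific case mappings.
                out.add(trimmed.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    /** Case-insensitive membership test against the normalized set. */
    static boolean wantsBloomFilter(Set<String> normalized, String schemaColumn) {
        return normalized.contains(schemaColumn.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> cols = normalize("ID, userName");
        System.out.println(wantsBloomFilter(cols, "id"));
        System.out.println(wantsBloomFilter(cols, "USERNAME"));
        System.out.println(wantsBloomFilter(cols, "other"));
    }
}
```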





Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 6:04 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java, 
  line 33
  https://reviews.apache.org/r/34059/diff/1/?file=955664#file955664line33
 
  booleans in java are false by default

I find this provides better readability. Are there any negatives to having the 
initial value set here?
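
For reference, a small self-contained sketch of the two styles being
discussed. Both fields start out false, since Java gives boolean instance
fields a default value of false; the explicit initializer only states the
intent. The class and field names here are hypothetical and are not taken from
the patch.

```java
public class BooleanDefaultSketch {
    /** Relies on the language default for boolean fields. */
    static class Implicit {
        boolean hasNext;
    }

    /** Spells out the same initial value for readability. */
    static class Explicit {
        boolean hasNext = false;
    }

    public static void main(String[] args) {
        System.out.println(new Implicit().hasNext);
        System.out.println(new Explicit().hasNext);
    }
}
```

Note that this default applies to fields only; a local variable must be
assigned before it is read.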


 On May 12, 2015, 6:04 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java, 
  line 50
  https://reviews.apache.org/r/34059/diff/1/?file=955664#file955664line50
 
  It is not necessary but I do not see a reason why the visibility of 
  this method should be reduced. Should it be public as all others?

The public functionality we need from that class is provided by the 
Iterator/Iterable interfaces; I didn't think it would be necessary to expose 
reset() since it is only really being used by the outer class.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83359
---


On May 11, 2015, 9:48 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34059/
 ---
 
 (Updated May 11, 2015, 9:48 p.m.)
 
 
 Review request for hive, Matt McCline and Vikram Dixit Kumaraswamy.
 
 
 Bugs: HIVE-10673
 https://issues.apache.org/jira/browse/HIVE-10673
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
 reducer are unsorted.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java eff4d30 
   itests/src/test/resources/testconfiguration.properties eeb46cc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b1352f3 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java d7f1b42 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesAdapter.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValues.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
 545d7c6 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 
 cdabe3a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 15c747e 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java
  a9082eb 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 d42b643 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
 4d84f0f 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7e1dbc 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 241e9d7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
   ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a342738 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java fb3c4a3 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java cee9100 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_1.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_2.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34059/diff/
 
 
 Testing
 ---
 
 q-file tests added
 
 
 Thanks,
 
 Jason Dere
 




Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 6:42 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java, line 439
  https://reviews.apache.org/r/34059/diff/1/?file=955675#file955675line439
 
  trailing space

will fix


 On May 12, 2015, 6:42 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java, line 315
  https://reviews.apache.org/r/34059/diff/1/?file=955677#file955677line315
 
  Remove this line and add String type declaration 3 lines below. Do not 
  confuse GC.

will fix


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83371
---






Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 5:51 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 674
  https://reviews.apache.org/r/34059/diff/1/?file=955661#file955661line674
 
  I think it's better to use Map.Entry here to avoid unnecessary lookup 
  get(pos)
  Map.Entry provides getKey, getValue, setValue methods.

will fix
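
A minimal sketch of the suggestion, assuming a hypothetical map from position to row count (the names below are illustrative, not from MapJoinOperator): iterating over entrySet() yields the key and value together, so no second get(pos) lookup is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EntryIteration {
    // Iterates over keys and looks each value up again: one extra hash lookup per key.
    static int sumWithLookup(Map<Integer, Integer> rowsByPos) {
        int total = 0;
        for (Integer pos : rowsByPos.keySet()) {
            total += rowsByPos.get(pos);
        }
        return total;
    }

    // Iterates over entries: key and value arrive together, no extra lookup.
    static int sumWithEntries(Map<Integer, Integer> rowsByPos) {
        int total = 0;
        for (Map.Entry<Integer, Integer> e : rowsByPos.entrySet()) {
            total += e.getValue();
        }
        return total;
    }

    // setValue() updates the map in place through the entry.
    static void doubleAll(Map<Integer, Integer> rowsByPos) {
        for (Map.Entry<Integer, Integer> e : rowsByPos.entrySet()) {
            e.setValue(e.getValue() * 2);
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> rows = new LinkedHashMap<Integer, Integer>();
        rows.put(0, 10);
        rows.put(1, 20);
        System.out.println(sumWithLookup(rows) == sumWithEntries(rows));
    }
}
```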


 On May 12, 2015, 5:51 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 679
  https://reviews.apache.org/r/34059/diff/1/?file=955661#file955661line679
 
  the same recommendation as above

will fix


 On May 12, 2015, 5:51 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 715
  https://reviews.apache.org/r/34059/diff/1/?file=955661#file955661line715
 
  Using replace(char, char) is faster than replace(CharSequence target, 
  CharSequence replacement) because it does not go through the 
  Pattern.compile().matcher().replaceAll API.
  
  Can you use replace('.', '_') instead of replace(".", "_")?

will fix
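
A small sketch of the two overloads side by side. The helper name is hypothetical. Both calls produce the same string for a single-character target; the difference is only in how the JDK implements them. In the JDK 7/8 implementations the CharSequence overload compiles a literal Pattern internally, while later JDKs may behave differently.

```java
final class ReplaceDemo {
    // char overload: a simple scan over the characters, no regex machinery.
    static String withCharOverload(String name) {
        return name.replace('.', '_');
    }

    // CharSequence overload: same result for this input.
    static String withCharSequenceOverload(String name) {
        return name.replace(".", "_");
    }

    public static void main(String[] args) {
        String name = "db.table.col";
        System.out.println(withCharOverload(name).equals(withCharSequenceOverload(name)));
    }
}
```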


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83356
---






Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java, line 
  95
  https://reviews.apache.org/r/34059/diff/1/?file=955671#file955671line95
 
  usually static Log should be private because superclass static methods 
  should use their own static Log to avoid confusion.

will change to private


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java, line 
  1094
  https://reviews.apache.org/r/34059/diff/1/?file=955671#file955671line1094
 
  Can you use Map.Entry to avoid the unnecessary lookup 3 lines below?

will fix


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java, 
  line 107
  https://reviews.apache.org/r/34059/diff/1/?file=955672#file955672line107
 
  ReduceSinkOperator uses Object.hashCode() and equals() methods.
  HashSet algo relies on hashCode/equals methods

So that means equals() only works if it is the exact same ReduceSinkOperator 
object. This should be ok for our usage, if we are referring to the same 
ReduceSinkOperator, we should be using that exact same object.
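
To illustrate the behaviour described here, a minimal sketch with a hypothetical stand-in class (Op is not the real ReduceSinkOperator): when a class does not override equals()/hashCode(), a HashSet falls back to Object's identity-based versions, so only the very same object counts as a duplicate.

```java
import java.util.HashSet;
import java.util.Set;

final class DefaultEqualsDemo {
    // Hypothetical stand-in: no equals()/hashCode() override, so Object's are used.
    static final class Op {
        final String name;

        Op(String name) {
            this.name = name;
        }
    }

    // Returns how many elements the HashSet considers distinct.
    static int distinctCount(Op... ops) {
        Set<Op> set = new HashSet<Op>();
        for (Op op : ops) {
            set.add(op);
        }
        return set.size();
    }

    public static void main(String[] args) {
        Op a = new Op("rs");
        Op b = new Op("rs"); // same field values, different object
        System.out.println(distinctCount(a, a)); // same object added twice
        System.out.println(distinctCount(a, b)); // two separate objects
    }
}
```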


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83362
---






Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 6:35 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java, line 109
  https://reviews.apache.org/r/34059/diff/1/?file=955673#file955673line109
 
  trailing space

will fix


 On May 12, 2015, 6:35 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java, line 423
  https://reviews.apache.org/r/34059/diff/1/?file=955675#file955675line423
 
  Why call getEntry(key) twice in a row? 
  containsKey() and get() both call getEntry() internally.
  
  Just call get(rs) once, check that the result is not null, and remove 
  the second get(rs).

will fix
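
A minimal sketch of the single-lookup pattern, using a hypothetical map and helper names rather than the actual GenTezWork code. It assumes the map never stores null values; otherwise a null result would be ambiguous and containsKey() would still be needed.

```java
import java.util.HashMap;
import java.util.Map;

final class SingleLookup {
    // Two hash lookups: containsKey() followed by get().
    static String describeTwice(Map<String, Integer> counts, String key) {
        if (counts.containsKey(key)) {
            return key + "=" + counts.get(key);
        }
        return key + " missing";
    }

    // One hash lookup: get() once, then a null check.
    static String describeOnce(Map<String, Integer> counts, String key) {
        Integer value = counts.get(key);
        if (value != null) {
            return key + "=" + value;
        }
        return key + " missing";
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        counts.put("rs", 3);
        System.out.println(describeOnce(counts, "rs").equals(describeTwice(counts, "rs")));
    }
}
```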


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83367
---






Review Request 34238: HIVE-10709 Update avro dependency to 1.7.7

2015-05-14 Thread Swarnim Kulkarni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34238/
---

Review request for hive and Alexander Pivovarov.


Bugs: HIVE-10709
https://issues.apache.org/jira/browse/HIVE-10709


Repository: hive-git


Description
---

HIVE-10709 Update avro dependency to 1.7.7


Diffs
-

  pom.xml 2e4ca36f31f2bbe89f9c0bb90ab9b4203085e773 

Diff: https://reviews.apache.org/r/34238/diff/


Testing
---


Thanks,

Swarnim Kulkarni



Fixed a couple of q tests which failed in recent builds. Need committer review

2015-05-14 Thread Alexander Pivovarov
HIVE-10665 https://issues.apache.org/jira/browse/HIVE-10665
udaf_percentile_approx_23.q

HIVE-10706 https://issues.apache.org/jira/browse/HIVE-10706
vectorized_timestamp_funcs.q


Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Alexander Pivovarov


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java, 
  line 107
  https://reviews.apache.org/r/34059/diff/1/?file=955672#file955672line107
 
  ReduceSinkOperator uses Object.hashCode() and equals() methods.
  HashSet algo relies on hashCode/equals methods
 
 Jason Dere wrote:
 So that means equals() only works if it is the exact same 
 ReduceSinkOperator object. This should be ok for our usage, if we are 
 referring to the same ReduceSinkOperator, we should be using that exact same 
 object.

Do you want to use IdentityHashMap then?
This class implements the Map interface with a hash table, using 
reference-equality in place of object-equality when comparing keys (and 
values). In other words, in an IdentityHashMap, two keys k1 and k2 are 
considered equal if and only if (k1==k2)
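
A short sketch of the difference, with hypothetical names: two String keys that are equal but are distinct objects collapse into one entry in a HashMap and stay separate in an IdentityHashMap.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityMapDemo {
    // Puts two equal-but-distinct keys into the given map and reports its size.
    static int sizeAfterPuts(Map<String, Integer> map) {
        String k1 = new String("rs");
        String k2 = new String("rs"); // k1.equals(k2) is true, k1 == k2 is false
        map.put(k1, 1);
        map.put(k2, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterPuts(new HashMap<String, Integer>()));         // keys compared with equals()
        System.out.println(sizeAfterPuts(new IdentityHashMap<String, Integer>())); // keys compared with ==
    }
}
```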


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83362
---






Re: [VOTE] Apache Hive 1.2.0 release candidate 4

2015-05-14 Thread Gunther Hagleitner
+1 ditto. same checks as last time.


From: Alan Gates alanfga...@gmail.com
Sent: Wednesday, May 13, 2015 1:35 PM
To: dev@hive.apache.org
Subject: Re: [VOTE] Apache Hive 1.2.0 release candidate 4

+1, same checks as last vote.

Alan.

Sushanth Sowmyan khorg...@gmail.com
May 13, 2015 at 11:50
Hi Folks,

We've cleared all the blockers listed for 1.2.0 release, either
committing them, or deferring out to an eventual 1.2.1 stabilization
release. (Any deferrals were a result of discussion between myself and
the committer responsible for the issue.) More details are available
here : https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

Apache Hive 1.2.0 Release Candidate 4 is available here:

https://people.apache.org/~khorgath/releases/1.2.0_RC4/artifacts/

My public key used for signing is available from the hive
committers key list : http://www.apache.org/dist/hive/KEYS

Maven artifacts are available here:

https://repository.apache.org/content/repositories/orgapachehive-1035

Source tag for RC4 is up on the apache git repo as tag
release-1.2.0-rc4 (Browseable view over at
https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=38c3daef84bafb13bf911ec6c69d7640430fba70
)

Since this has minimal changes from the previous RC, I would further
request that this vote conclude in 30 hours (which is past the 72 hr
time from the previous RC announcement) if we have enough +1s in the
meanwhile.

Hive PMC Members: Please test and vote.

Thanks,
-Sushanth


Re: [VOTE] Apache Hive 1.2.0 release candidate 4

2015-05-14 Thread Thejas Nair
+1
Verified the signature and checksum
Built the src.tar.gz, ran queries from both newly built package and
bin.tar.gz. Ran hive cli and beeline queries in local mode.
Checked RELEASE_NOTES.txt , README.txt, LICENSE, NOTICE

On Wed, May 13, 2015 at 1:35 PM, Alan Gates alanfga...@gmail.com wrote:

 +1, same checks as last vote.

 Alan.

   Sushanth Sowmyan khorg...@gmail.com
  May 13, 2015 at 11:50
 Hi Folks,

 We've cleared all the blockers listed for 1.2.0 release, either
 committing them, or deferring out to an eventual 1.2.1 stabilization
 release. (Any deferrals were a result of discussion between myself and
 the committer responsible for the issue.) More details are available
 here :
 https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

 Apache Hive 1.2.0 Release Candidate 4 is available here:

 https://people.apache.org/~khorgath/releases/1.2.0_RC4/artifacts/

 My public key used for signing is available from the hive
 committers key list : http://www.apache.org/dist/hive/KEYS

 Maven artifacts are available here:

 https://repository.apache.org/content/repositories/orgapachehive-1035

 Source tag for RC4 is up on the apache git repo as tag
 release-1.2.0-rc4 (Browseable view over at

 https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=38c3daef84bafb13bf911ec6c69d7640430fba70
 )

 Since this has minimal changes from the previous RC, I would further
 request that this vote conclude in 30 hours (which is past the 72 hr
 time from the previous RC announcement) if we have enough +1s in the
 meanwhile.

 Hive PMC Members: Please test and vote.

 Thanks,
 -Sushanth




Review Request 34235: HIVE-10687 Fix avro deserialization issues for evolved unions

2015-05-14 Thread Swarnim Kulkarni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34235/
---

Review request for hive and Brock Noland.


Bugs: HIVE-10687
https://issues.apache.org/jira/browse/HIVE-10687


Repository: hive-git


Description
---

HIVE-10687 Fix avro deserialization issues for evolved unions


Diffs
-

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 
e94cd83c064199ba719cc2de222edd0e12401c8c 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 
eb495b4e1fc5874b30936f646b5bdb5aa8734130 
  
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java
 c9e7d68b211ebc8c66af243fe85f4f89c6fd6cf3 

Diff: https://reviews.apache.org/r/34235/diff/


Testing
---


Thanks,

Swarnim Kulkarni



Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Alexander Pivovarov


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java, 
  line 107
  https://reviews.apache.org/r/34059/diff/1/?file=955672#file955672line107
 
  ReduceSinkOperator uses Object.hashCode() and equals() methods.
  HashSet algo relies on hashCode/equals methods
 
 Jason Dere wrote:
 So that means equals() only works if it is the exact same 
 ReduceSinkOperator object. This should be ok for our usage, if we are 
 referring to the same ReduceSinkOperator, we should be using that exact same 
 object.
 
 Alexander Pivovarov wrote:
 Do you want to use IdentityHashMap then?
 This class implements the Map interface with a hash table, using 
 reference-equality in place of object-equality when comparing keys (and 
 values). In other words, in an IdentityHashMap, two keys k1 and k2 are 
 considered equal if and only if (k1==k2)
 
 Jason Dere wrote:
 We're using a Set here as opposed to a Map. I'll change to use 
 Sets.newIdentityHashSet() from Guava.

IdentityHashMap already contains a private KeySet class;
to get an instance of it you can call the keySet() method,
e.g.
IdentityHashMap<Integer, Object> rsMap = new IdentityHashMap<Integer, Object>();
rsMap.put(1, null);
rsMap.put(2, null);
rsMap.put(3, null);
Set<Integer> rsSet = rsMap.keySet();
System.out.println(rsSet);
[3, 1, 2]


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83362
---


On May 15, 2015, 1:02 a.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34059/
 ---
 
 (Updated May 15, 2015, 1:02 a.m.)
 
 
 Review request for hive, Matt McCline and Vikram Dixit Kumaraswamy.
 
 
 Bugs: HIVE-10673
 https://issues.apache.org/jira/browse/HIVE-10673
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
 reducer are unsorted.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java eff4d30 
   itests/src/test/resources/testconfiguration.properties f9c9351 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b1352f3 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java d7f1b42 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesAdapter.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValues.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
 545d7c6 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 
 cdabe3a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 e9bd44a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java
  a9082eb 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 d42b643 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
 4d84f0f 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7e1dbc 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 241e9d7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
   ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a342738 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java fb3c4a3 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java cee9100 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_1.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_2.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34059/diff/
 
 
 Testing
 ---
 
 q-file tests added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Created] (HIVE-10715) RAT failures - many files do not have ASF licenses

2015-05-14 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-10715:
---

 Summary: RAT failures - many files do not have ASF licenses
 Key: HIVE-10715
 URL: https://issues.apache.org/jira/browse/HIVE-10715
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


Lots of files do not have proper ASF headers included. We should add them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere


 On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java, 
  line 107
  https://reviews.apache.org/r/34059/diff/1/?file=955672#file955672line107
 
  ReduceSinkOperator uses Object.hashCode() and equals() methods.
  HashSet algo relies on hashCode/equals methods
 
 Jason Dere wrote:
 So that means equals() only works if it is the exact same 
 ReduceSinkOperator object. This should be ok for our usage, if we are 
 referring to the same ReduceSinkOperator, we should be using that exact same 
 object.
 
 Alexander Pivovarov wrote:
 Do you want to use IdentityHashMap then?
 This class implements the Map interface with a hash table, using 
 reference-equality in place of object-equality when comparing keys (and 
 values). In other words, in an IdentityHashMap, two keys k1 and k2 are 
 considered equal if and only if (k1==k2)

We're using a Set here as opposed to a Map. I'll change to use 
Sets.newIdentityHashSet() from Guava.
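
For readers following the thread, here is a minimal sketch of the difference being discussed. It uses only the JDK: Collections.newSetFromMap over an IdentityHashMap gives a reference-equality set, which is the behaviour Guava documents for Sets.newIdentityHashSet(). The Key class and the method names are made up for illustration and are not part of the Hive patch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Objects;
import java.util.Set;

public class IdentitySetSketch {
    // Hypothetical key type that overrides equals()/hashCode() by value.
    static final class Key {
        final String name;
        Key(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    // Size of a regular HashSet, which compares elements with equals()/hashCode().
    static int hashSetSize(Object... items) {
        Set<Object> set = new HashSet<>();
        Collections.addAll(set, items);
        return set.size();
    }

    // Size of an identity set, which compares elements with == only.
    static int identitySetSize(Object... items) {
        Set<Object> set = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Collections.addAll(set, items);
        return set.size();
    }

    public static void main(String[] args) {
        Key a = new Key("rs");
        Key b = new Key("rs"); // equal by value, but a different object
        System.out.println(hashSetSize(a, b));     // value equality collapses a and b
        System.out.println(identitySetSize(a, b)); // identity keeps both
        System.out.println(identitySetSize(a, a)); // same reference is stored once
    }
}
```

If a class does not override equals()/hashCode() at all (as stated above for ReduceSinkOperator), a plain HashSet already behaves like the identity set; the identity set simply makes that intent explicit and keeps working if equals() is overridden later.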


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83362
---


On May 11, 2015, 9:48 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34059/
 ---
 
 (Updated May 11, 2015, 9:48 p.m.)
 
 
 Review request for hive, Matt McCline and Vikram Dixit Kumaraswamy.
 
 
 Bugs: HIVE-10673
 https://issues.apache.org/jira/browse/HIVE-10673
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
 reducer are unsorted.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java eff4d30 
   itests/src/test/resources/testconfiguration.properties eeb46cc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b1352f3 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java d7f1b42 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesAdapter.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValues.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
 545d7c6 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 
 cdabe3a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 15c747e 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java
  a9082eb 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 d42b643 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
 4d84f0f 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7e1dbc 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 241e9d7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
   ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a342738 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java fb3c4a3 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java cee9100 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_1.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_2.q PRE-CREATION 
   ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_1.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34059/diff/
 
 
 Testing
 ---
 
 q-file tests added
 
 
 Thanks,
 
 Jason Dere
 




Review Request 34248: HIVE-10684 Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-14 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34248/
---

Review request for hive and Sushanth Sowmyan.


Bugs: HIVE-10684
https://issues.apache.org/jira/browse/HIVE-10684


Repository: hive-git


Description
---

Remove binaries from source and fix the failed cases


Diffs
-

  ql/pom.xml f1a6f7d 
  ql/src/test/org/apache/hadoop/hive/ql/session/TestSessionState.java 45ba07e 
  ql/src/test/resources/RefreshedJarClassV1.txt PRE-CREATION 
  ql/src/test/resources/RefreshedJarClassV2.txt PRE-CREATION 

Diff: https://reviews.apache.org/r/34248/diff/


Testing
---

UT passed


Thanks,

cheng xu



[jira] [Created] (HIVE-10717) Fix failed qtest encryption_insert_partition_static test in Jenkin

2015-05-14 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-10717:
---

 Summary: Fix failed qtest encryption_insert_partition_static test 
in Jenkin
 Key: HIVE-10717
 URL: https://issues.apache.org/jira/browse/HIVE-10717
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu


It can be reproduced in Jenkins. See 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3898/testReport/
 





Re: JIRA notifications

2015-05-14 Thread kulkarni.swar...@gmail.com
Yeah I was having issues with both the manual method as well as with rbt.
But seems like things are back to normal now.

Thanks guys!
On May 14, 2015 12:51 PM, Alexander Pivovarov apivova...@gmail.com
wrote:

 You can use the following command to create new review. It takes about 3-5
 sec
 $ rbt post -g yes

 To update the review you can run.
 $ rbt post -u -g yes

 On Thu, May 14, 2015 at 10:48 AM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:

  @Swarnim..
  Generating patch with git diff needs to include the full index for it to
  be uploaded to review board. "git diff --full-index".
  https://code.google.com/p/reviewboard/issues/detail?id=3115
 
  - Prasanth
 
   On May 14, 2015, at 9:14 AM, Thejas Nair thejas.n...@gmail.com
 wrote:
  
   Now that we have moved to git, you can try using github pull request
  instead.
   It also  integrates with jira.
   More git instructions - http://accumulo.apache.org/git.html
  
  
   On Thu, May 14, 2015 at 8:01 AM, kulkarni.swar...@gmail.com
   kulkarni.swar...@gmail.com wrote:
   Also not sure if it's related but seems like RB has been pretty
 sluggish
   lately too for me. It takes forever for a patch to be submitted and a
  review
   request created(the latest one is still running for past 30 minutes
  with no
   output)
  
   On Wed, May 13, 2015 at 4:26 PM, Lefty Leverenz 
  leftylever...@gmail.com
   wrote:
  
   By the way, we still need to add iss...@hive.apache.org to the
   website's Mailing
   Lists http://hive.apache.org/mailing_lists.html page -- see
  HIVE-10124
   https://issues.apache.org/jira/browse/HIVE-10124.
  
   -- Lefty
  
   On Wed, May 13, 2015 at 2:16 PM, Lefty Leverenz 
  leftylever...@gmail.com
   wrote:
  
   But some notifications and comments aren't making it onto any Hive
   mailing
   list -- see INFRA-9221 
  https://issues.apache.org/jira/browse/INFRA-9221
   (please
   add your own comments and examples).  This means the mail archives
  don't
   have a complete record of JIRA activity.
  
   -- Lefty
  
   On Wed, May 13, 2015 at 10:03 AM, Thejas Nair 
 thejas.n...@gmail.com
   wrote:
  
   comments now added go to iss...@hive.apache.org .
   emails for JIRAs created should still go to dev@
  
  
   On Wed, May 13, 2015 at 9:25 AM, kulkarni.swar...@gmail.com
   kulkarni.swar...@gmail.com wrote:
   I noticed that I haven't been getting notifications(or they are
  really
   delayed) on any of the new JIRAs created/ comments added. Anyone
  else
   noticing similar issues as well?
  
   --
   Swarnim
  
  
  
  
  
  
  
   --
   Swarnim
 
 



Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez

2015-05-14 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/
---

(Updated May 15, 2015, 1:02 a.m.)


Review request for hive, Matt McCline and Vikram Dixit Kumaraswamy.


Changes
---

Addressing RB feedback from apivovarov


Bugs: HIVE-10673
https://issues.apache.org/jira/browse/HIVE-10673


Repository: hive-git


Description
---

Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
reducer are unsorted.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java eff4d30 
  itests/src/test/resources/testconfiguration.properties f9c9351 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b1352f3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java d7f1b42 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesAdapter.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValues.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
545d7c6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 
cdabe3a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
e9bd44a 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java
 a9082eb 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
d42b643 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 4d84f0f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
f7e1dbc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 241e9d7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a342738 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java fb3c4a3 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java cee9100 
  ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_1.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/34059/diff/


Testing
---

q-file tests added


Thanks,

Jason Dere



Review Request 34249: Case folding with nulls in expression with filter operator

2015-05-14 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34249/
---

Review request for hive and Gopal V.


Bugs: HIVE-10716
https://issues.apache.org/jira/browse/HIVE-10716


Repository: hive-git


Description
---

Case folding with nulls in expression with filter operator


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java
 209f717 
  ql/src/test/queries/clientpositive/fold_case.q 3f9e3a3 
  ql/src/test/results/clientpositive/fold_case.q.out de6c43e 
  ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out 45a0cb1 
  ql/src/test/results/clientpositive/fold_when.q.out 51d4767 

Diff: https://reviews.apache.org/r/34249/diff/


Testing
---

New tests.


Thanks,

Ashutosh Chauhan



Re: [VOTE] Apache Hive 1.2.0 release candidate 4

2015-05-14 Thread Sushanth Sowmyan
Sorry folks, we discovered one more issue with the RC - ASF headers were
missing in a couple of files. I'm in the process of spinning out an RC5
with the fix.

And since we already have functional testing of the RC done, and these are
trivial changes from the previous RC, I again propose a shortened RC time
for this RC as well. Those that have tested so far, I thank you for your
efforts and your patience, and request another testing of the next RC.

Thanks,
-Sushanth


On Thu, May 14, 2015 at 3:14 PM, Thejas Nair thejas.n...@gmail.com wrote:

 +1
 Verified the signature and checksum
 Built the src.tar.gz, ran queries from both newly built package and
 bin.tar.gz. Ran hive cli and beeline queries in local mode.
 Checked RELEASE_NOTES.txt , README.txt, LICENSE, NOTICE

 On Wed, May 13, 2015 at 1:35 PM, Alan Gates alanfga...@gmail.com wrote:

 +1, same checks as last vote.

 Alan.

   Sushanth Sowmyan khorg...@gmail.com
  May 13, 2015 at 11:50
 Hi Folks,

 We've cleared all the blockers listed for 1.2.0 release, either
 committing them, or deferring out to an eventual 1.2.1 stabilization
 release. (Any deferrals were a result of discussion between myself and
 the committer responsible for the issue.) More details are available
 here :
 https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

 Apache Hive 1.2.0 Release Candidate 4 is available here:

 https://people.apache.org/~khorgath/releases/1.2.0_RC4/artifacts/

 My public key used for signing is available from the hive
 committers key list : http://www.apache.org/dist/hive/KEYS

 Maven artifacts are available here:

 https://repository.apache.org/content/repositories/orgapachehive-1035

 Source tag for RC4 is up on the apache git repo as tag
 release-1.2.0-rc4 (Browseable view over at

 https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=38c3daef84bafb13bf911ec6c69d7640430fba70
 )

 Since this has minimal changes from the previous RC, I would further
 request that this vote conclude in 30 hours(which is past the 72 hr
 time from the previous RC announcement) if we have enough +1s in the
 meanwhile.

 Hive PMC Members: Please test and vote.

 Thanks,
 -Sushanth





Re: [VOTE] Apache Hive 1.2.0 release candidate 5

2015-05-14 Thread Vikram Dixit K
I built against hadoop1 and hadoop2 and ran the rat tool as well. Ran
a couple of queries.

+1

Thanks
Vikram.

On Thu, May 14, 2015 at 6:30 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Hi Folks,

 We've cleared all the blockers listed for 1.2.0 release, either
 committing them, or deferring out to an eventual 1.2.1 stabilization
 release. (Any deferrals were a result of discussion between myself and
 the committer responsible for the issue.) More details are available
 here : 
 https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

 Apache Hive 1.2.0 Release Candidate 5 is available here:

 https://people.apache.org/~khorgath/releases/1.2.0_RC5/artifacts/

 My public key used for signing is available from the hive
 committers key list : http://www.apache.org/dist/hive/KEYS

 Maven artifacts are available here:

 https://repository.apache.org/content/repositories/orgapachehive-1039

 Source tag for RC5 is up on the apache git repo as tag
 release-1.2.0-rc5 (Browseable view over at
 https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=76b90268084f529852396302884297b3c22fcf00
 )

 Since this has minimal changes from the previous RC, I would further
 request that this vote conclude in 20 hours(which is past the 72 hr
 time from the previous RC announcement) if we have enough +1s in the
 meanwhile.

 Hive PMC Members: Please test and vote.

 Thanks,
 -Sushanth



-- 
Nothing better than when appreciated for hard work.
-Mark


Re: Review Request 34235: HIVE-10687 Fix avro deserialization issues for evolved unions

2015-05-14 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34235/#review83883
---



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java
https://reviews.apache.org/r/34235/#comment134958

trailing space



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
https://reviews.apache.org/r/34235/#comment134959

trailing space



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
https://reviews.apache.org/r/34235/#comment134964

Do you need to cover null value case?



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
https://reviews.apache.org/r/34235/#comment134960

trailing spaces



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
https://reviews.apache.org/r/34235/#comment134962

remove space pls


Some minor issues and a question

- cheng xu


On May 14, 2015, 10:07 p.m., Swarnim Kulkarni wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34235/
 ---
 
 (Updated May 14, 2015, 10:07 p.m.)
 
 
 Review request for hive and Brock Noland.
 
 
 Bugs: HIVE-10687
 https://issues.apache.org/jira/browse/HIVE-10687
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-10687 Fix avro deserialization issues for evolved unions
 
 
 Diffs
 -
 
   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 
 e94cd83c064199ba719cc2de222edd0e12401c8c 
   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 
 eb495b4e1fc5874b30936f646b5bdb5aa8734130 
   
 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java
  c9e7d68b211ebc8c66af243fe85f4f89c6fd6cf3 
 
 Diff: https://reviews.apache.org/r/34235/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Swarnim Kulkarni
 




[jira] [Created] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-14 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-10716:
---

 Summary: Fold case/when udf for expression involving nulls in 
filter operator.
 Key: HIVE-10716
 URL: https://issues.apache.org/jira/browse/HIVE-10716
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


From HIVE-10636 comments, more folding is possible.





[VOTE] Apache Hive 1.2.0 release candidate 5

2015-05-14 Thread Sushanth Sowmyan
Hi Folks,

We've cleared all the blockers listed for 1.2.0 release, either
committing them, or deferring out to an eventual 1.2.1 stabilization
release. (Any deferrals were a result of discussion between myself and
the committer responsible for the issue.) More details are available
here : https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

Apache Hive 1.2.0 Release Candidate 5 is available here:

https://people.apache.org/~khorgath/releases/1.2.0_RC5/artifacts/

My public key used for signing is available from the hive
committers key list : http://www.apache.org/dist/hive/KEYS

Maven artifacts are available here:

https://repository.apache.org/content/repositories/orgapachehive-1039

Source tag for RC5 is up on the apache git repo as tag
release-1.2.0-rc5 (Browseable view over at
https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=76b90268084f529852396302884297b3c22fcf00
)

Since this has minimal changes from the previous RC, I would further
request that this vote conclude in 20 hours(which is past the 72 hr
time from the previous RC announcement) if we have enough +1s in the
meanwhile.

Hive PMC Members: Please test and vote.

Thanks,
-Sushanth


[jira] [Created] (HIVE-10719) Hive metastore failure when alter table rename is attempted.

2015-05-14 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-10719:
-

 Summary: Hive metastore failure when alter table rename is 
attempted.
 Key: HIVE-10719
 URL: https://issues.apache.org/jira/browse/HIVE-10719
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0, 1.2.0, 1.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K


{code}
create database newDB location '/tmp/';
describe database extended newDB;
use newDB;
create table tab (name string);
alter table tab rename to newName;
{code}

Fails:

{code}
InvalidOperationException(message:Unable to access old location 
hdfs://localhost:8020/tmp/tab for table x.tab)
{code}





Review Request 34197: HIVE-10706 Make vectorized_timestamp_funcs test more stable

2015-05-14 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34197/
---

Review request for hive and Jason Dere.


Bugs: HIVE-10706
https://issues.apache.org/jira/browse/HIVE-10706


Repository: hive-git


Description
---

HIVE-10706 Make vectorized_timestamp_funcs test more stable


Diffs
-

  ql/src/test/queries/clientpositive/vectorized_timestamp_funcs.q 
8a2d5aaf5fb0396e551bdefdde507d1e9902919b 
  ql/src/test/results/clientpositive/spark/vectorized_timestamp_funcs.q.out 
304458215b4dcbc4d49321ba5f14ca5a87f2ec26 
  ql/src/test/results/clientpositive/tez/vectorized_timestamp_funcs.q.out 
fa3ed21232004d710b33cadac66680eabaca2c8a 
  ql/src/test/results/clientpositive/vectorized_timestamp_funcs.q.out 
31a96c68b22bd5332fb71b52982de71710df65fa 

Diff: https://reviews.apache.org/r/34197/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Created] (HIVE-10707) CBO: debug logging OOMs

2015-05-14 Thread Gopal V (JIRA)
Gopal V created HIVE-10707:
--

 Summary: CBO: debug logging OOMs
 Key: HIVE-10707
 URL: https://issues.apache.org/jira/browse/HIVE-10707
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Gopal V
Priority: Trivial


{code}
hive> source xcross.sql;
OK
Time taken: 0.837 seconds
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.ql.parse.ASTNode.dump(ASTNode.java:111)
at org.apache.hadoop.hive.ql.parse.ASTNode.dump(ASTNode.java:119)
at org.apache.hadoop.hive.ql.parse.ASTNode.dump(ASTNode.java:119)
at org.apache.hadoop.hive.ql.parse.ASTNode.dump(ASTNode.java:119)
at org.apache.hadoop.hive.ql.parse.ASTNode.dump(ASTNode.java:119)
{code}

The query contains 360 join clauses, wrapped in a UNION ALL.

Looks like {{genOpTree}} does 

{code}
  this.ctx.setCboInfo("Plan optimized by CBO.");
  this.ctx.setCboSucceeded(true);
  LOG.debug(newAST.dump());
  }
{code}

the debug logging OOMs.
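
Java evaluates a method argument before the call, so the dump string is built even when debug output is switched off. One common mitigation, shown here as a sketch and not necessarily the fix adopted for this issue, is to guard the call with a level check. The example uses java.util.logging so that it is self-contained; Hive's own LOG object is a different logging API, and expensiveDump() is a hypothetical stand-in for the AST dump.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedDebugLogSketch {
    private static final Logger LOG = Logger.getLogger("GuardedDebugLogSketch");

    // Counts how often the expensive string is actually built.
    static int dumpCalls = 0;

    // Hypothetical stand-in for a costly recursive tree-to-string dump.
    static String expensiveDump() {
        dumpCalls++;
        return "(large tree dump)";
    }

    static void setLevel(Level level) {
        LOG.setLevel(level);
    }

    // Unguarded: the argument is evaluated before fine() can discard it.
    static void logPlanUnguarded() {
        LOG.fine(expensiveDump());
    }

    // Guarded: the dump is only built when FINE (debug) output is enabled.
    static void logPlan() {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveDump());
        }
    }

    public static void main(String[] args) {
        setLevel(Level.INFO);
        logPlanUnguarded();
        logPlan();
        System.out.println("dumps built with debug off: " + dumpCalls);
    }
}
```

The guard only helps when debug logging is disabled. With debug enabled the dump is still built, so a plan with hundreds of joins could still exhaust the heap.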





Re: Review Request 33881: HIVE-10623 Implement hive cli options using beeline functionality

2015-05-14 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33881/#review83724
---



beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java
https://reviews.apache.org/r/33881/#comment134770

You can use IOUtils.closeQuietly(bw)
I do not think we need to log close buffer error
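
As a rough illustration of the suggestion, the sketch below closes a writer in a finally block and deliberately ignores any exception from close(). The closeQuietly helper here is a hypothetical JDK-only stand-in written for this example; the review comment refers to a library IOUtils method, which is not reproduced here.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

public class CloseQuietlySketch {
    // Hypothetical helper: close the resource and swallow any IOException.
    static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (IOException ignored) {
            // Intentionally not logged, as suggested in the review comment.
        }
    }

    // Writes text through a BufferedWriter and returns what reached the sink.
    static String writeAndClose(String text) {
        StringWriter sink = new StringWriter();
        BufferedWriter bw = null;
        try {
            bw = new BufferedWriter(sink);
            bw.write(text);
            // Flush explicitly so a write failure is not hidden by the quiet close.
            bw.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            closeQuietly(bw);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeAndClose("select 1;"));
    }
}
```

Swallowing the close error is reasonable in a test helper; in production code that writes real data, try-with-resources is usually the safer choice because it reports a failed close.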


- Alexander Pivovarov


On May 14, 2015, 5:51 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33881/
 ---
 
 (Updated May 14, 2015, 5:51 a.m.)
 
 
 Review request for hive and Xuefu Zhang.
 
 
 Bugs: HIVE-10623
 https://issues.apache.org/jira/browse/HIVE-10623
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Changes:
 1. Support the hive cli options including database, e, !, H, f.
 2. Add error handler for using f and e together
 3. Add error handler for invalid option
 
 
 Diffs
 -
 
   beeline/src/java/org/apache/hive/beeline/BeeLine.java 0da15f6 
   beeline/src/java/org/apache/hive/beeline/cli/CliOptionsProcessor.java 
 PRE-CREATION 
   beeline/src/java/org/apache/hive/beeline/cli/HiveCli.java PRE-CREATION 
   beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java PRE-CREATION 
   beeline/src/test/resources/hive-site.xml PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/33881/diff/
 
 
 Testing
 ---
 
 Newly added unit test passed locally.
 
 
 Thanks,
 
 cheng xu
 




[jira] [Created] (HIVE-10718) Update committer list - Add Ferdinand Xu

2015-05-14 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-10718:
---

 Summary: Update committer list - Add Ferdinand Xu
 Key: HIVE-10718
 URL: https://issues.apache.org/jira/browse/HIVE-10718
 Project: Hive
  Issue Type: Bug
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor


NO PRECOMMIT TESTS
add myself to committer list





Re: [VOTE] Apache Hive 1.2.0 release candidate 5

2015-05-14 Thread Gunther Hagleitner
One more time, with feeling :-)

+1, same verification as last time.

From: Vikram Dixit K vikram.di...@gmail.com
Sent: Thursday, May 14, 2015 6:51 PM
To: dev@hive.apache.org
Cc: hive-...@hadoop.apache.org
Subject: Re: [VOTE] Apache Hive 1.2.0 release candidate 5

I built against hadoop1 and hadoop2 and ran the rat tool as well. Ran
a couple of queries.

+1

Thanks
Vikram.

On Thu, May 14, 2015 at 6:30 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Hi Folks,

 We've cleared all the blockers listed for 1.2.0 release, either
 committing them, or deferring out to an eventual 1.2.1 stabilization
 release. (Any deferrals were a result of discussion between myself and
 the committer responsible for the issue.) More details are available
 here : 
 https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status

 Apache Hive 1.2.0 Release Candidate 5 is available here:

 https://people.apache.org/~khorgath/releases/1.2.0_RC5/artifacts/

 My public key used for signing is available from the hive
 committers key list : http://www.apache.org/dist/hive/KEYS

 Maven artifacts are available here:

 https://repository.apache.org/content/repositories/orgapachehive-1039

 Source tag for RC5 is up on the apache git repo as tag
 release-1.2.0-rc5 (Browseable view over at
 https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=76b90268084f529852396302884297b3c22fcf00
 )

 Since this has minimal changes from the previous RC, I would further
 request that this vote conclude in 20 hours(which is past the 72 hr
 time from the previous RC announcement) if we have enough +1s in the
 meanwhile.

 Hive PMC Members: Please test and vote.

 Thanks,
 -Sushanth



--
Nothing better than when appreciated for hard work.
-Mark


FAILED: IndexOutOfBoundsException Index: 3, Size: 3

2015-05-14 Thread lushuai2...@139.com
Hello,
   Man!
 
When I execute HQL such as the following:
 
FROM INPUT
INSERT OVERWRITE TABLE temp2 partition(dt='2015-05-15',type='type1')
SELECT key,ip, 
count(distinct(value)) as uv, 
count(distinct(val1)) as pv 
where ip='2015-05-13' and key='dstdomain1'
GROUP BY key,ip
INSERT OVERWRITE TABLE temp2 partition(dt='2015-05-15',type='type2')
SELECT key,ip, 
count(distinct(value)) as uv, 
count(distinct(val1)) as pv 
where ip='2015-05-13' and key='dstdomain'
GROUP BY key,ip;
 
Hive throws an exception:
 
   FAILED: IndexOutOfBoundsException Index: 3, Size: 3
   15/05/15 09:50:39 [main]: ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 3, Size: 3

java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$ReduceSinkLineage.process(OpProcFactory.java:477)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
at org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:95)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:182)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10216)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1037)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   

Review Request 34261: query on view results fails with table not found error if view is created with subquery alias (CTE).

2015-05-14 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34261/
---

Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

1. When a fully qualified identifier (db.tablename) is specified in the from 
clause, we seem to resolve it against CTE aliases. This is wrong: if the table 
doesn't exist in the catalog, we should fail.
2. If a fully qualified name is not used in the from clause, then 
a) we should first resolve the identifier against CTE aliases; 
b) if the identifier is not found in the CTE list, try to resolve it against 
the catalog.
3. Views: in unparsetranslator we treat a CTE name as a catalog table; this is a 
bug.
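
The intended resolution order for points 1 and 2 can be sketched roughly as follows. This is an illustrative model only: the class, the method name, the plain string sets and the "contains a dot" test for a qualified name are assumptions made for the example and do not correspond to the SemanticAnalyzer code in the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CteResolutionSketch {
    enum Source { CTE, CATALOG }

    // Resolves a from-clause identifier following the rules described above.
    // catalogTables holds fully qualified names such as "default.t".
    static Source resolve(String name, Set<String> cteAliases,
                          Set<String> catalogTables, String currentDb) {
        if (name.contains(".")) {
            // Rule 1: a qualified name is looked up in the catalog only.
            if (catalogTables.contains(name)) {
                return Source.CATALOG;
            }
            throw new IllegalArgumentException("Table not found: " + name);
        }
        // Rule 2a: an unqualified name is matched against CTE aliases first.
        if (cteAliases.contains(name)) {
            return Source.CTE;
        }
        // Rule 2b: otherwise fall back to the catalog in the current database.
        if (catalogTables.contains(currentDb + "." + name)) {
            return Source.CATALOG;
        }
        throw new IllegalArgumentException("Table not found: " + name);
    }

    public static void main(String[] args) {
        Set<String> cte = new HashSet<>(Arrays.asList("t"));
        Set<String> catalog = new HashSet<>(Arrays.asList("default.t", "default.u"));
        System.out.println(resolve("t", cte, catalog, "default"));
        System.out.println(resolve("u", cte, catalog, "default"));
        System.out.println(resolve("default.t", cte, catalog, "default"));
    }
}
```

The sketch ignores identifier case handling and view expansion (point 3).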


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 4bb256d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 30c87ad 
  ql/src/test/queries/clientpositive/cteViews.q PRE-CREATION 
  ql/src/test/results/clientpositive/cteViews.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/34261/diff/


Testing
---


Thanks,

pengcheng xiong