Re: Plan: permanently move hive builds from bigtop

2014-04-21 Thread Szehon Ho
It looks great, thanks Lefty!


On Sun, Apr 20, 2014 at 2:22 PM, Lefty Leverenz leftylever...@gmail.com wrote:

 Nice doc, Szehon.  I did some minor editing so you might want to make sure
 I didn't introduce any errors.

 https://cwiki.apache.org/confluence/display/Hive/Hive+PTest2+Infrastructure

 -- Lefty


 On Sat, Apr 19, 2014 at 9:45 PM, Szehon Ho sze...@cloudera.com wrote:

  Migration is done; I updated the wiki with all the details of the new
  setup:
 
 https://cwiki.apache.org/confluence/display/Hive/Hive+PTest2+Infrastructure
 
  New Jenkins URL to submit pre-commit jobs:
 
 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/
  Again, this has to be done manually for the time being, by clicking on
  'build with parameters' and entering the issue number as a parameter.  I've
  submitted some already.  I'll reach out to some committers to get the
  auto-trigger working.
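  For reference, a minimal sketch of doing the same thing without the UI, via
  Jenkins' standard buildWithParameters remote API.  This is only an
  illustration: the Jenkins URL is the one above, but the parameter name ISSUE
  and the absence of authentication are assumptions, not the job's confirmed
  configuration.

  {code}
  // Hypothetical sketch: POST to Jenkins' buildWithParameters endpoint to queue a
  // pre-commit run for a JIRA.  The parameter name "ISSUE" is a placeholder.
  import java.net.HttpURLConnection;
  import java.net.URL;

  public class TriggerPrecommit {
    public static void main(String[] args) throws Exception {
      String jenkins = "http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins";
      // parameters passed in the query string; e.g. queue a run for HIVE-6937
      URL url = new URL(jenkins + "/job/precommit-hive/buildWithParameters?ISSUE=HIVE-6937");
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("POST");
      System.out.println("Jenkins responded with HTTP " + conn.getResponseCode());
    }
  }
  {code}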
 
  As I mentioned, there is some work to fix the test-reporting, due to the
  framework using the old URL scheme.  I am tracking it at
  HIVE-6937 (https://issues.apache.org/jira/browse/HIVE-6937).
  For now I am hosting the log directory separately; if you want to see test
  logs, you have to manually go to the URL corresponding to your build, like:

  http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/precommit-hive-11/ for
  run #11.  Sorry about that.
 
  Let me know if you see other issues, thanks!
  Szehon
 
 
  On Fri, Apr 18, 2014 at 2:11 PM, Thejas Nair the...@hortonworks.com
  wrote:
 
   Sounds good.
   Thanks Szehon!
  
  
    On Fri, Apr 18, 2014 at 10:17 AM, Ashutosh Chauhan hashut...@apache.org
    wrote:
+1 Thanks Szehon!
   
   
On Fri, Apr 18, 2014 at 6:29 AM, Xuefu Zhang xzh...@cloudera.com
   wrote:
   
+1. Thanks for taking care of this.
   
   
On Thu, Apr 17, 2014 at 11:00 PM, Szehon Ho sze...@cloudera.com
   wrote:
   
 Hi,

  This week the machine running Hive builds at
  http://bigtop01.cloudera.org:8080/view/Hive/? ran out of space, so new
  jobs like Precommit tests stopped.  It's still not resolved there; there
  was another email today on the Bigtop list, but there are very few people
  with root access to that host, and they still haven't responded.

  I chatted with Brock; he has also seen various issues with the Bigtop
  Jenkins in the past, so I am thinking of moving the Jenkins jobs to the
  PTest master itself, where some PMC members already have access and can
  administer it if needed.  Currently I am hosting the pre-commit Jenkins job
  on my own EC2 instance as a stop-gap.

  Other advantages of hosting our own Jenkins:
  1. No need to wait for other Bigtop jobs to run.
  2. Bigtop is using a version of Jenkins that doesn't show parameters like
  the JIRA number for queued jobs, so it's impossible to tell whether a patch
  got picked up and where it is in the queue.
  3. Eliminates the network hop from the Bigtop box to our PTest master.

  The disadvantage is:
  1. We don't have much experience doing Jenkins admin, but it doesn't look
  too bad.  Mostly, restart if there's an issue and clean up if it runs out
  of space.

  I wonder what people think, and whether there are any objections to this?
  If not, I'll try setting it up this weekend.  Then there is some follow-up
  work, like changing the Jenkins URLs displayed in the test report.

 Thanks!
 Szehon

   
  
   --
   CONFIDENTIALITY NOTICE
   NOTICE: This message is intended for the use of the individual or
 entity
  to
   which it is addressed and may contain information that is confidential,
   privileged and exempt from disclosure under applicable law. If the
 reader
   of this message is not the intended recipient, you are hereby notified
  that
   any printing, copying, dissemination, distribution, disclosure or
   forwarding of this communication is strictly prohibited. If you have
   received this communication in error, please contact the sender
  immediately
   and delete it from your system. Thank You.
  
 



Pre commit tests on Hadoop-2

2014-04-21 Thread Szehon Ho
As per discussion with Ashutosh and Brock, it's also been decided to make the new
PreCommit build use the Hadoop-2 profile, as future developments will be focused
there, and it's better to do it quickly now rather than let the hadoop-2
q.out files drift again.

Thanks to all who helped fix the hundreds of hadoop-2 test failures, but
when I ran last night there were still 43 failed tests.  So you may see
these failures in your Precommit results; let's see if we can tackle these
quickly.

Thanks!
Szehon


2014-04-20 06:31:40,768  WARN PTest.run:202 43 failed tests


2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
2014-04-20 06:31:40,768  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
2014-04-20 06:31:40,769  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_17
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
2014-04-20 06:31:40,770  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
2014-04-20 06:31:40,771  WARN PTest.run:205
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
2014-04-20 

[jira] [Reopened] (HIVE-6936) Provide table properties to InputFormats

2014-04-21 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reopened HIVE-6936:
-


 Provide table properties to InputFormats
 

 Key: HIVE-6936
 URL: https://issues.apache.org/jira/browse/HIVE-6936
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: HIVE-6936.patch


 Some advanced file formats need the table properties made available to them. 
 Additionally, it would be convenient to provide a unique id for fetch 
 operators and the complete list of directories.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Issue Comment Deleted] (HIVE-6936) Provide table properties to InputFormats

2014-04-21 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6936:


Comment: was deleted

(was: Pushed compilation fix to condorM30-0.11.0 as 41089f18.)

 Provide table properties to InputFormats
 

 Key: HIVE-6936
 URL: https://issues.apache.org/jira/browse/HIVE-6936
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: HIVE-6936.patch


 Some advanced file formats need the table properties made available to them. 
 Additionally, it would be convenient to provide a unique id for fetch 
 operators and the complete list of directories.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6936) Provide table properties to InputFormats

2014-04-21 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975629#comment-13975629
 ] 

Owen O'Malley commented on HIVE-6936:
-

Sorry, I closed the wrong bug!

 Provide table properties to InputFormats
 

 Key: HIVE-6936
 URL: https://issues.apache.org/jira/browse/HIVE-6936
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: HIVE-6936.patch


 Some advanced file formats need the table properties made available to them. 
 Additionally, it would be convenient to provide a unique id for fetch 
 operators and the complete list of directories.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)
Daniel Weeks created HIVE-6938:
--

 Summary: Add Support for Parquet Column Rename
 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks


Parquet was originally introduced without 'replace columns' support in ql.  In 
addition, the default behavior for parquet is to access columns by name as 
opposed to by index by the Serde.  

Parquet should allow for either columnar (index based) access or name based 
access because it can support either.







--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Weeks updated HIVE-6938:
---

Status: Patch Available  (was: Open)

The patch contains a small change to DDLTask to add support for replace columns 
as well as a change to the Serde to allow switching between column index based 
access and name based access of columns.

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Weeks updated HIVE-6938:
---

Attachment: HIVE-6938.1.patch

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6807) add HCatStorer ORC test to test missing columns

2014-04-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6807:
-

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Thanks Eugene for correcting my misread of the patch.  Patch checked in.

 add HCatStorer ORC test to test missing columns
 ---

 Key: HIVE-6807
 URL: https://issues.apache.org/jira/browse/HIVE-6807
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.14.0

 Attachments: HIVE-6807.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975734#comment-13975734
 ] 

Julien Le Dem commented on HIVE-6938:
-

I find the terminology columnar.access confusing, but otherwise this looks 
good to me.

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6927) Add support for MSSQL in schematool

2014-04-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975747#comment-13975747
 ] 

Ashutosh Chauhan commented on HIVE-6927:


+1

 Add support for MSSQL in schematool
 ---

 Key: HIVE-6927
 URL: https://issues.apache.org/jira/browse/HIVE-6927
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6927.patch


 Schematool is the preferred way of initializing the schema for Hive. Since 
 HIVE-6862 provided the script for MSSQL, it would be nice to add support 
 for it in schematool.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975746#comment-13975746
 ] 

Daniel Weeks commented on HIVE-6938:


Confusion is understandable considering parquet is columnar.  How about 
column.index.access?

I'll update the patch.
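
As a rough illustration only (assumed names, not the actual patch): on the SerDe 
side such a switch could boil down to reading a boolean table property during 
initialization, with "column.index.access" being just the name proposed above.

{code}
import java.util.Properties;

// Hypothetical helper, not from the patch: decide the column access mode from a table property.
public class ColumnAccess {
  public static boolean useIndexAccess(Properties tableProperties) {
    // default to name-based access, the current Parquet SerDe behavior described in this issue
    return Boolean.parseBoolean(tableProperties.getProperty("column.index.access", "false"));
  }
}
{code}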

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing

2014-04-21 Thread Nick White (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975767#comment-13975767
 ] 

Nick White commented on HIVE-538:
-

[~ashutoshc] Not really; it manually lists some dependencies (not the 
transitive ones) instead of using Maven to work them out, and creates a tar.gz 
of many jars rather than a single jar with all the dependencies in it. A tar.gz 
can't easily be integrated with Maven, whereas it's easy to add this complete 
jar as a dependency to a third-party Maven project because it's published with 
a distinct classifier.

 make hive_jdbc.jar self-containing
 --

 Key: HIVE-538
 URL: https://issues.apache.org/jira/browse/HIVE-538
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0
Reporter: Raghotham Murthy
Assignee: Nick White
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch


 Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
 required in the classpath to run JDBC applications on Hive. We need to do 
 at least the following to get rid of most unnecessary dependencies:
 1. get rid of dynamic serde and use a standard serialization format, maybe 
 tab separated, JSON or Avro
 2. don't use Hadoop configuration parameters
 3. repackage Thrift and fb303 classes into hive_jdbc.jar



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing

2014-04-21 Thread Nick White (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975771#comment-13975771
 ] 

Nick White commented on HIVE-538:
-

Also, duplicating hive-jdbc's dependencies in an XML file in a different 
project will increase maintenance costs, as the two lists will have to be 
manually kept in sync.

 make hive_jdbc.jar self-containing
 --

 Key: HIVE-538
 URL: https://issues.apache.org/jira/browse/HIVE-538
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0
Reporter: Raghotham Murthy
Assignee: Nick White
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch


 Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
 required in the classpath to run JDBC applications on Hive. We need to do 
 at least the following to get rid of most unnecessary dependencies:
 1. get rid of dynamic serde and use a standard serialization format, maybe 
 tab separated, JSON or Avro
 2. don't use Hadoop configuration parameters
 3. repackage Thrift and fb303 classes into hive_jdbc.jar



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6916) Export/import inherit permissions from parent directory

2014-04-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6916:
--

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks to Szehon for the patch.

 Export/import inherit permissions from parent directory
 ---

 Key: HIVE-6916
 URL: https://issues.apache.org/jira/browse/HIVE-6916
 Project: Hive
  Issue Type: Bug
  Components: Security
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 0.14.0

 Attachments: HIVE-6916.2.patch, HIVE-6916.patch


 Exporting a table into an external location and importing it into Hive should 
 set the table to have the permissions of the parent directory, if the flag 
 hive.warehouse.subdir.inherit.perms is set.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975809#comment-13975809
 ] 

Xuefu Zhang commented on HIVE-6411:
---

I believe there are some minor items to be resolved on review board.

 Support more generic way of using composite key for HBaseHandler
 

 Key: HIVE-6411
 URL: https://issues.apache.org/jira/browse/HIVE-6411
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, 
 HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, 
 HIVE-6411.6.patch.txt, HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, 
 HIVE-6411.9.patch.txt


 HIVE-2599 introduced using custom object for the row key. But it forces key 
 objects to extend HBaseCompositeKey, which is again extension of LazyStruct. 
 If user provides proper Object and OI, we can replace internal key and keyOI 
 with those. 
 Initial implementation is based on factory interface.
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws 
 SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws 
 SerDeException;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975824#comment-13975824
 ] 

Xuefu Zhang commented on HIVE-6835:
---

I think that's pretty much what you need to do. While #2 may touch many files, 
it's fairly safe, as #1 guarantees that the same code will be exercised. There 
isn't much API change: you add one with a default implementation and deprecate 
the old one. In #3, you have both property sets and do whatever you need for 
Avro.
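
A rough sketch of the deprecation pattern being described, with hypothetical 
names (this is not the actual patch): keep the old single-Properties method, 
deprecate it, and add an overload whose default implementation delegates to it 
so existing implementations keep working.

{code}
import java.util.Properties;
import org.apache.hadoop.conf.Configuration;

// Illustrative only: shows "add one with a default implementation and deprecate the old one".
public abstract class ExampleSerDe {
  /** Old entry point; existing SerDes continue to override this. */
  @Deprecated
  public abstract void initialize(Configuration conf, Properties tableProperties) throws Exception;

  /**
   * New entry point carrying both table and partition properties.  The default
   * implementation preserves the old behavior; an Avro-aware SerDe would override it
   * and reconcile the two property sets (step #3 above).
   */
  public void initialize(Configuration conf, Properties tableProperties, Properties partitionProperties)
      throws Exception {
    initialize(conf, tableProperties);
  }
}
{code}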

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
 "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
 "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
 "name":"a", "type":{"type":"array","items":"string"} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975851#comment-13975851
 ] 

Julien Le Dem commented on HIVE-6938:
-

Sounds good to me!

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton

2014-04-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975889#comment-13975889
 ] 

Eugene Koifman commented on HIVE-5072:
--

[~shuainie] please file the 2 follow-up tickets for the doc and version/sqoop 
issues.

 [WebHCat]Enable directly invoke Sqoop job through Templeton
 ---

 Key: HIVE-5072
 URL: https://issues.apache.org/jira/browse/HIVE-5072
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, 
 HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf


 Now it is hard to invoke a Sqoop job through Templeton. The only way is to 
 use the classpath jar generated by a Sqoop job and use the jar delegator in 
 Templeton. We should implement a Sqoop Delegator to enable directly invoking 
 a Sqoop job through Templeton.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2

2014-04-21 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6939:


 Summary: TestExecDriver.testMapRedPlan3 fails on hadoop-2
 Key: HIVE-6939
 URL: https://issues.apache.org/jira/browse/HIVE-6939
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere


Passes on hadoop-1, but fails on hadoop-2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action

2014-04-21 Thread Shuaishuai Nie (JIRA)
Shuaishuai Nie created HIVE-6940:


 Summary: [WebHCat]Update documentation for Templeton-Sqoop action
 Key: HIVE-6940
 URL: https://issues.apache.org/jira/browse/HIVE-6940
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Shuaishuai Nie


WebHCat documentation needs to be updated based on the new feature introduced in 
HIVE-5072



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6941) [WebHCat] Implement webhcat endpoint version/sqoop

2014-04-21 Thread Shuaishuai Nie (JIRA)
Shuaishuai Nie created HIVE-6941:


 Summary: [WebHCat] Implement webhcat endpoint version/sqoop
 Key: HIVE-6941
 URL: https://issues.apache.org/jira/browse/HIVE-6941
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Shuaishuai Nie


Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should 
also expose the endpoint version/sqoop to return the version of Sqoop used.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-04-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6901:
--

Attachment: HIVE-6901.2.patch

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, 
 HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975941#comment-13975941
 ] 

Daniel Weeks commented on HIVE-6938:


Patch #2 has the disambiguated property name.

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename

2014-04-21 Thread Daniel Weeks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Weeks updated HIVE-6938:
---

Attachment: HIVE-6938.2.patch

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975950#comment-13975950
 ] 

Hive QA commented on HIVE-6901:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12641106/HIVE-6901.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/17/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/17/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: NullPointerException: driver
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12641106

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, 
 HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6942) Explanation of GROUPING__ID is confusing

2014-04-21 Thread chris schrader (JIRA)
chris schrader created HIVE-6942:


 Summary: Explanation of GROUPING__ID is confusing
 Key: HIVE-6942
 URL: https://issues.apache.org/jira/browse/HIVE-6942
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: chris schrader
Priority: Minor


The explanation given for GROUPING__ID in enhanced aggregations is very 
incomplete and confusing based on the example.  Documentation here: 

https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup#EnhancedAggregation,Cube,GroupingandRollup-Grouping__IDfunction

It would be far easier to understand if the bit vector were explained better 
alongside the examples given, i.e., also explain identifying each column in 
terms of the binary digit it contributes and then show the resulting binary 
number converted to decimal.  In the examples provided, the binary equivalents 
of the grouping IDs for the first example would be 1, 11, 11, representing the 
columns included in the aggregation.  The documentation is very confusing 
without this clear connection between building a binary number and converting 
it (just referring to it as a bit vector isn't sufficient for the average user).
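
For what it's worth, the missing step is just a base conversion; a trivial 
illustration (plain arithmetic, not Hive API) using the bit strings quoted above:

{code}
public class GroupingIdBits {
  public static void main(String[] args) {
    // the bit strings from the example above, converted to the decimal GROUPING__ID values
    for (String bits : new String[] {"1", "11", "11"}) {
      System.out.println(bits + " (binary) = " + Integer.parseInt(bits, 2) + " (decimal)");
    }
  }
}
{code}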



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2

2014-04-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975990#comment-13975990
 ] 

Jason Dere commented on HIVE-6939:
--

It appears that on hadoop-2 there are multiple output files for this test, 
whereas on hadoop-1 there is just a single file.  Also, the test looks like it 
assumes there is just a single file to compare against.
For some reason this test specifies 5 reducers.  I've been told that 
hadoop-1 Hive did not obey the number of reducers, while hadoop-2 does.  This 
could explain why this test works on hadoop-1: it only ever used 1 reducer and 
so generated a single output file.

Changing this test to use just a single reducer allows it to pass with 
hadoop-2.  Does anyone know the history of this test and why it was set to use 
5 reducers?
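
A minimal sketch of the change being described, assuming the test's plan is 
driven by a JobConf-style configuration (the class and method below are 
illustrative placeholders, not the actual patch):

{code}
import org.apache.hadoop.mapred.JobConf;

public class SingleReducerPlan {
  // Force one reducer so the job writes a single output file, which the
  // golden-file comparison in the test expects; previously the plan asked for 5.
  public static void configure(JobConf conf) {
    conf.setNumReduceTasks(1);
  }
}
{code}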

 TestExecDriver.testMapRedPlan3 fails on hadoop-2
 

 Key: HIVE-6939
 URL: https://issues.apache.org/jira/browse/HIVE-6939
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere

 Passes on hadoop-1, but fails on hadoop-2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-04-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6901:


Attachment: HIVE-6901.2.patch

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, 
 HIVE-6901.2.patch, HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2

2014-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6939:
-

Status: Patch Available  (was: Open)

 TestExecDriver.testMapRedPlan3 fails on hadoop-2
 

 Key: HIVE-6939
 URL: https://issues.apache.org/jira/browse/HIVE-6939
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6939.1.patch


 Passes on hadoop-1, but fails on hadoop-2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2

2014-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6939:
-

Attachment: HIVE-6939.1.patch

The patch changes the test to use a single reducer so that there is just a 
single output file.

 TestExecDriver.testMapRedPlan3 fails on hadoop-2
 

 Key: HIVE-6939
 URL: https://issues.apache.org/jira/browse/HIVE-6939
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6939.1.patch


 Passes on hadoop-1, but fails on hadoop-2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-04-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975991#comment-13975991
 ] 

Szehon Ho commented on HIVE-6901:
-

Sorry about that; there was a problem with the hadoop-2 test-property file on 
the build machine. I'll re-submit this.

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, 
 HIVE-6901.2.patch, HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work

2014-04-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976014#comment-13976014
 ] 

Alan Gates commented on HIVE-6541:
--

I've added page links for [Hive 
Transactions|https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions]
 and [Streaming 
Ingest|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] 
and linked the Hive Transactions page in the User docs section of the wiki home 
page.  Feel free to edit or move these docs around if you think they would be 
better placed somewhere else.

 Need to write documentation for ACID work
 -

 Key: HIVE-6541
 URL: https://issues.apache.org/jira/browse/HIVE-6541
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: hive-6541-changesAfterFirstEdit.rtf, 
 hive-6541-firstEdit.rtf, hive-6541.txt


 ACID introduces a number of new config file options, tables in the metastore, 
 keywords in the grammar, and a new interface for use of tools like storm and 
 flume.  These need to be documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-6541) Need to write documentation for ACID work

2014-04-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved HIVE-6541.
--

Resolution: Fixed

 Need to write documentation for ACID work
 -

 Key: HIVE-6541
 URL: https://issues.apache.org/jira/browse/HIVE-6541
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: hive-6541-changesAfterFirstEdit.rtf, 
 hive-6541-firstEdit.rtf, hive-6541.txt


 ACID introduces a number of new config file options, tables in the metastore, 
 keywords in the grammar, and a new interface for use of tools like storm and 
 flume.  These need to be documented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6943:
---

Summary: TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on 
hadoop-2  (was: TestMinimrCliDriver.testCliDriver_root_dir_external_table)

 TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
 ---

 Key: HIVE-6943
 URL: https://issues.apache.org/jira/browse/HIVE-6943
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashutosh Chauhan

 Seems like this test passes for hadoop-1 but is flaky.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table

2014-04-21 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6943:
--

 Summary: TestMinimrCliDriver.testCliDriver_root_dir_external_table
 Key: HIVE-6943
 URL: https://issues.apache.org/jira/browse/HIVE-6943
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashutosh Chauhan


Seems like this test passes for hadoop-1 but is flaky.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976064#comment-13976064
 ] 

Ashutosh Chauhan commented on HIVE-6943:


Stack trace I got is:
{code}
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:302)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.init(HadoopShimsSecure.java:249)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:363)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:591)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:288)
... 11 more
Caused by: java.io.FileNotFoundException: Path is not a file: /Users
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:51)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1627)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1570)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1550)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1524)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:476)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:289)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at 
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1133)
at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1121)
at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:272)
at 

[jira] [Commented] (HIVE-6924) MapJoinKeyBytes::hashCode() should use Murmur hash

2014-04-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976081#comment-13976081
 ] 

Jitendra Nath Pandey commented on HIVE-6924:


VectorHashKeyWrapper also uses Arrays.hashCode(). VectorHashKeyWrapper is used 
in VectorGroupByOperator for keys to the aggregates. Murmur hash should help 
there as well.
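
A sketch of the idea (not the Hive patch itself), using Guava's Murmur3 
implementation purely as an illustration of swapping out Arrays.hashCode() for 
a better-distributed hash:

{code}
import java.util.Arrays;
import com.google.common.hash.Hashing;

public class KeyHashing {
  public static int arraysHash(byte[] key) {
    return Arrays.hashCode(key);                        // existing approach; hash codes cluster
  }

  public static int murmurHash(byte[] key) {
    return Hashing.murmur3_32().hashBytes(key).asInt(); // Murmur3 spreads the bits better
  }
}
{code}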

 MapJoinKeyBytes::hashCode() should use Murmur hash
 --

 Key: HIVE-6924
 URL: https://issues.apache.org/jira/browse/HIVE-6924
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6924.patch


 Existing hashCode is bad, causes HashMap to cluster



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6944:


 Summary: WebHCat e2e tests broken by HIVE-6432
 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6944:
-

Description: 
HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests

NO PRECOMMIT TESTS

  was:HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e 
tests


 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6944:
-

Attachment: HIVE-6944.patch

 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6944.patch


 HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6944:
-

Status: Patch Available  (was: Open)

 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6944.patch


 HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4576) templeton.hive.properties does not allow values with commas

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4576:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Eugene!

 templeton.hive.properties does not allow values with commas
 ---

 Key: HIVE-4576
 URL: https://issues.apache.org/jira/browse/HIVE-4576
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.5.0
Reporter: Vitaliy Fuks
Assignee: Eugene Koifman
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-4576.2.patch, HIVE-4576.patch


 templeton.hive.properties accepts a comma-separated list of key=value 
 property pairs that will be passed to Hive.
 However, this makes it impossible to use any value that itself has a comma 
 in it.
 For example:
 {code:xml}<property>
   <name>templeton.hive.properties</name>
   <value>hive.metastore.sasl.enabled=false,hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083</value>
 </property>{code}
 {noformat}templeton: starting [/usr/bin/hive, --service, cli, --hiveconf, 
 hive.metastore.sasl.enabled=false, --hiveconf, 
 hive.metastore.uris=thrift://foo1.example.com:9083, --hiveconf, 
 foo2.example.com:9083 etc..{noformat}
 because the value is parsed using standard 
 org.apache.hadoop.conf.Configuration.getStrings() call which simply splits on 
 commas from here:
 {code:java}for (String prop : 
 appConf.getStrings(AppConfig.HIVE_PROPS_NAME)){code}
 This is problematic for any hive property that itself has multiple values, 
 such as hive.metastore.uris above or hive.aux.jars.path.
 There should be some way to escape commas or a different delimiter should 
 be used.
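 To make the reported behaviour concrete, here is a minimal, self-contained sketch 
 (not part of any attached patch) of the split, using only the standard 
 org.apache.hadoop.conf.Configuration API and the same value as above:
 {code:java}
 import org.apache.hadoop.conf.Configuration;

 public class CommaSplitSketch {
   public static void main(String[] args) {
     Configuration conf = new Configuration(false);
     // Same value as the <property> above: the thrift URI list itself contains a comma.
     conf.set("templeton.hive.properties",
         "hive.metastore.sasl.enabled=false,"
         + "hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083");
     // getStrings() splits on every comma, so the second URI is detached from its key.
     for (String prop : conf.getStrings("templeton.hive.properties")) {
       System.out.println(prop);
     }
   }
 }
 {code}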



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2

2014-04-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976101#comment-13976101
 ] 

Jason Dere commented on HIVE-6943:
--

Had already opened HIVE-6401 for this one - looks like hadoop-2 is returning 
directories during getSplits()

 TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
 ---

 Key: HIVE-6943
 URL: https://issues.apache.org/jira/browse/HIVE-6943
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashutosh Chauhan

 Seems like this test passes for hadoop-1 but is flaky.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-6943.


Resolution: Duplicate

Oh, missed that one. Resolving this one as dupe of HIVE-6401

 TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
 ---

 Key: HIVE-6943
 URL: https://issues.apache.org/jira/browse/HIVE-6943
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashutosh Chauhan

 Seems like this test passes for hadoop-1 but is flaky.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6945) issues with dropping partitions on Oracle

2014-04-21 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-6945:
--

 Summary: issues with dropping partitions on Oracle
 Key: HIVE-6945
 URL: https://issues.apache.org/jira/browse/HIVE-6945
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[ANNOUNCE] Apache Hive 0.13.0 Released

2014-04-21 Thread Harish Butani
The Apache Hive team is proud to announce the release of Apache
Hive version 0.13.0.

The Apache Hive (TM) data warehouse software facilitates querying and
managing large datasets residing in distributed storage. Built on top
of Apache Hadoop (TM), it provides:

* Tools to enable easy data extract/transform/load (ETL)

* A mechanism to impose structure on a variety of data formats

* Access to files stored either directly in Apache HDFS (TM) or in other
  data storage systems such as Apache HBase (TM)

* Query execution via MapReduce

For Hive release details and downloads, please visit:
http://www.apache.org/dyn/closer.cgi/hive/

Hive 0.13.0 Release Notes are available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843

We would like to thank the many contributors who made this release
possible.

Regards,

The Apache Hive Team

PS: we are having technical difficulty updating the website. Will resolve
this shortly.


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-21 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976144#comment-13976144
 ] 

Anthony Hsu commented on HIVE-6835:
---

Great, sounds like we're on the same page. I'll implement this new approach and 
upload a new patch soon.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record",
 "fields":[{"name":"a","type":{"type":"array","items":"string"}}]}')
 STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record",
 "fields":[{"name":"intfield","type":"int","default":0},
 {"name":"a","type":{"type":"array","items":"string"}}]}');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6940:
-

Component/s: Documentation

 [WebHCat]Update documentation for Templeton-Sqoop action
 

 Key: HIVE-6940
 URL: https://issues.apache.org/jira/browse/HIVE-6940
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Affects Versions: 0.14.0
Reporter: Shuaishuai Nie

 WebHCat documentation needs to be updated based on the new feature introduced 
 in HIVE-5072



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6940:
-

Affects Version/s: 0.14.0

 [WebHCat]Update documentation for Templeton-Sqoop action
 

 Key: HIVE-6940
 URL: https://issues.apache.org/jira/browse/HIVE-6940
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Shuaishuai Nie

 WebHCat documentation needs to be updated based on the new feature introduced 
 in HIVE-5072



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton

2014-04-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976163#comment-13976163
 ] 

Eugene Koifman commented on HIVE-5072:
--

Verified that the tests run.

[~shuainie] Thank you for filing the tickets, but I think they need to have more 
than a one-line description.  HIVE-6541 has an example of what to provide to 
documentation writers.

 [WebHCat]Enable directly invoke Sqoop job through Templeton
 ---

 Key: HIVE-5072
 URL: https://issues.apache.org/jira/browse/HIVE-5072
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, 
 HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf


 Now it is hard to invoke a Sqoop job through Templeton. The only way is to 
 use the classpath jar generated by a Sqoop job and use the jar delegator in 
 Templeton. We should implement a Sqoop delegator to enable directly invoking 
 Sqoop jobs through Templeton.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6945) issues with dropping partitions on Oracle

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976162#comment-13976162
 ] 

Xuefu Zhang commented on HIVE-6945:
---

Could we have some description of what issues are in focus here? The title 
alone doesn't seem to provide any essential information that helps the readers.

 issues with dropping partitions on Oracle
 -

 Key: HIVE-6945
 URL: https://issues.apache.org/jira/browse/HIVE-6945
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin





--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [ANNOUNCE] Apache Hive 0.13.0 Released

2014-04-21 Thread Thejas Nair
Thanks to Harish for all the hard work managing and getting the release out!

This is great news! This is a significant release in hive! This has
more than twice the number of jiras included (see release note link),
compared to 0.12, and earlier releases which were also out after a
similar gap of 5-6 months. It shows tremendous growth in hive
community activity!

hive 0.13 - 1081
hive 0.12 - 439
hive 0.11 - 374

-Thejas

On Mon, Apr 21, 2014 at 3:17 PM, Harish Butani rhbut...@apache.org wrote:
 The Apache Hive team is proud to announce the release of Apache
 Hive version 0.13.0.

 The Apache Hive (TM) data warehouse software facilitates querying and
 managing large datasets residing in distributed storage. Built on top
 of Apache Hadoop (TM), it provides:

 * Tools to enable easy data extract/transform/load (ETL)

 * A mechanism to impose structure on a variety of data formats

 * Access to files stored either directly in Apache HDFS (TM) or in other
   data storage systems such as Apache HBase (TM)

 * Query execution via MapReduce

 For Hive release details and downloads, please visit:
 http://www.apache.org/dyn/closer.cgi/hive/

 Hive 0.13.0 Release Notes are available here:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843

 We would like to thank the many contributors who made this release
 possible.

 Regards,

 The Apache Hive Team

 PS: we are having technical difficulty updating the website. Will resolve
 this shortly.

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976176#comment-13976176
 ] 

Sushanth Sowmyan commented on HIVE-6944:


+1 , will commit after the 24h period. :)

 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6944.patch


 HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6946) Make it easier to run WebHCat tesst

2014-04-21 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6946:


 Summary: Make it easier to run WebHCat tesst
 Key: HIVE-6946
 URL: https://issues.apache.org/jira/browse/HIVE-6946
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set 
up WebHCat e2e tests but it's cumbersome and error prone.  Need to make some 
improvements here.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6946:
-

Summary: Make it easier to run WebHCat e2e tests  (was: Make it easier to 
run WebHCat tesst)

 Make it easier to run WebHCat e2e tests
 ---

 Key: HIVE-6946
 URL: https://issues.apache.org/jira/browse/HIVE-6946
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to 
 set up WebHCat e2e tests but it's cumbersome and error prone.  Need to make 
 some improvements here.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-4824) make TestWebHCatE2e run w/o requiring installing external hadoop

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-4824.
--

Resolution: Won't Fix

real unit tests for webhcat would take too much effort so will instead simplify 
running existing e2e tests in HIVE-6946

 make TestWebHCatE2e run w/o requiring installing external hadoop
 

 Key: HIVE-4824
 URL: https://issues.apache.org/jira/browse/HIVE-4824
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 Currently WebHCat will use hive/build/dist/hcatalog/bin/hcat to execute DDL 
 commands, which in turn uses Hadoop Jar command.
 This in turn requires that HADOOP_HOME env var be defined and point to an 
 existing Hadoop install.  
 Need to see if we can apply the hive/testutils/hadoop idea here to make WebHCat 
 not depend on an external hadoop install.  
 This will make Unit tests better/easier to write and make dev/test cycle 
 simpler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [ANNOUNCE] Apache Hive 0.13.0 Released

2014-04-21 Thread Harish Butani
The link to the Release Notes is wrong.
Thanks Szehon Ho for pointing this out.
The correct link is:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324986&styleName=Text&projectId=12310843


On Mon, Apr 21, 2014 at 4:23 PM, Thejas Nair the...@hortonworks.com wrote:

 Thanks to Harish for all the hard work managing and getting the release
 out!

 This is great news! This is a significant release in hive! This has
 more than twice the number of jiras included (see release note link),
 compared to 0.12, and earlier releases which were also out after a
 similar gap of 5-6 months. It shows tremendous growth in hive
 community activity!

 hive 0.13 - 1081
 hive 0.12 - 439
 hive 0.11 - 374

 -Thejas

 On Mon, Apr 21, 2014 at 3:17 PM, Harish Butani rhbut...@apache.org
 wrote:
  The Apache Hive team is proud to announce the release of Apache
  Hive version 0.13.0.
 
  The Apache Hive (TM) data warehouse software facilitates querying and
  managing large datasets residing in distributed storage. Built on top
  of Apache Hadoop (TM), it provides:
 
  * Tools to enable easy data extract/transform/load (ETL)
 
  * A mechanism to impose structure on a variety of data formats
 
  * Access to files stored either directly in Apache HDFS (TM) or in other
data storage systems such as Apache HBase (TM)
 
  * Query execution via MapReduce
 
  For Hive release details and downloads, please visit:
  http://www.apache.org/dyn/closer.cgi/hive/
 
  Hive 0.13.0 Release Notes are available here:
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843
 
  We would like to thank the many contributors who made this release
  possible.
 
  Regards,
 
  The Apache Hive Team
 
  PS: we are having technical difficulty updating the website. Will resolve
  this shortly.

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


[jira] [Updated] (HIVE-5538) Turn on vectorization by default.

2014-04-21 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-5538:
---

Status: Open  (was: Patch Available)

 Turn on vectorization by default.
 -

 Key: HIVE-5538
 URL: https://issues.apache.org/jira/browse/HIVE-5538
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch


   Vectorization should be turned on by default, so that users don't have to 
 specifically enable vectorization. 
   Vectorization code validates and ensures that a query falls back to row 
 mode if it is not supported on vectorized code path. 
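 As a point of reference (not part of the attached patches), a minimal JDBC sketch of 
 how a session opts in today; the connection URL, credentials, and table name are 
 placeholders, while hive.vectorized.execution.enabled is the switch this issue 
 proposes to default to true:
 {code:java}
 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.Statement;

 public class EnableVectorization {
   public static void main(String[] args) throws Exception {
     Class.forName("org.apache.hive.jdbc.HiveDriver");
     // Placeholder HiveServer2 URL and credentials.
     try (Connection conn = DriverManager.getConnection(
              "jdbc:hive2://localhost:10000/default", "hive", "");
          Statement stmt = conn.createStatement()) {
       // Today every session has to opt in explicitly.
       stmt.execute("set hive.vectorized.execution.enabled=true");
       // Placeholder query; it runs vectorized only where the plan supports it,
       // otherwise execution falls back to row mode.
       stmt.execute("select count(*) from src");
     }
   }
 }
 {code}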



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5538) Turn on vectorization by default.

2014-04-21 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-5538:
---

Attachment: HIVE-5538.3.patch

Many tests failed due to differences in explain output caused by data 
size stats. The attached patch fixes them.

 Turn on vectorization by default.
 -

 Key: HIVE-5538
 URL: https://issues.apache.org/jira/browse/HIVE-5538
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch


   Vectorization should be turned on by default, so that users don't have to 
 specifically enable vectorization. 
   Vectorization code validates and ensures that a query falls back to row 
 mode if it is not supported on vectorized code path. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5538) Turn on vectorization by default.

2014-04-21 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-5538:
---

Status: Patch Available  (was: Open)

 Turn on vectorization by default.
 -

 Key: HIVE-5538
 URL: https://issues.apache.org/jira/browse/HIVE-5538
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch


   Vectorization should be turned on by default, so that users don't have to 
 specifically enable vectorization. 
   Vectorization code validates and ensures that a query falls back to row 
 mode if it is not supported on vectorized code path. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976233#comment-13976233
 ] 

Hive QA commented on HIVE-6901:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12641113/HIVE-6901.2.patch

{color:red}ERROR:{color} -1 due to 122 failed/errored test(s), 5416 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binarysortable_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4

[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6947:
---

Summary: More fixes for tests on hadoop-2   (was: More fixes for hadoop-2)

 More fixes for tests on hadoop-2 
 -

 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan

 Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6947) More fixes for hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6947:
--

 Summary: More fixes for hadoop-2
 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan


Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6947:
---

Attachment: HIVE-6947.patch

Doesn't include file size change diffs.

 More fixes for tests on hadoop-2 
 -

 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan
 Attachments: HIVE-6947.patch


 Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HIVE-6947) More fixes for tests on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6947 started by Ashutosh Chauhan.

 More fixes for tests on hadoop-2 
 -

 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6947.patch


 Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6947:
---

Status: Patch Available  (was: In Progress)

 More fixes for tests on hadoop-2 
 -

 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6947.patch


 Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6947) More fixes for tests on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-6947:
--

Assignee: Ashutosh Chauhan

 More fixes for tests on hadoop-2 
 -

 Key: HIVE-6947
 URL: https://issues.apache.org/jira/browse/HIVE-6947
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6947.patch


 Few more fixes for test cases on hadoop-2



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6946:
-

Description: 
Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set 
up WebHCat e2e tests but it's cumbersome and error prone.  Need to make some 
improvements here.

NO PRECOMMIT TESTS

  was:Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps 
to set up WebHCat e2e tests but it's cumbersome and error prone.  Need to make 
some improvements here.


 Make it easier to run WebHCat e2e tests
 ---

 Key: HIVE-6946
 URL: https://issues.apache.org/jira/browse/HIVE-6946
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to 
 set up WebHCat e2e tests but it's cumbersome and error prone.  Need to make 
 some improvements here.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6927) Add support for MSSQL in schematool

2014-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6927:
---

Status: Patch Available  (was: Open)

 Add support for MSSQL in schematool
 ---

 Key: HIVE-6927
 URL: https://issues.apache.org/jira/browse/HIVE-6927
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6927.patch


 Schematool is the preferred way of initializing schema for Hive. Since 
 HIVE-6862 provided the script for MSSQL it would be nice to add the support 
 for it in schematool.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6932) hive README needs update

2014-04-21 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6932:
-

Assignee: Thejas M Nair

 hive README needs update
 

 Key: HIVE-6932
 URL: https://issues.apache.org/jira/browse/HIVE-6932
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair

 It needs to be updated to include Tez as a runtime. Also, it talks about 
 average latency being in minutes, which is very misleading.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6932) hive README needs update

2014-04-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6932:


Description: 
It needs to be updated to include Tez as a runtime. Also, it talks about 
average latency being in minutes, which is very misleading.
NO PRECOMMIT TESTS


  was:
It needs to be updated to include Tez as a runtime. Also, it talks about 
average latency being in minutes, which is very misleading.



 hive README needs update
 

 Key: HIVE-6932
 URL: https://issues.apache.org/jira/browse/HIVE-6932
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6932.1.patch


 It needs to be updated to include Tez as a runtime. Also, it talks about 
 average latency being in minutes, which is very misleading.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6932) hive README needs update

2014-04-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6932:


Attachment: HIVE-6932.1.patch

 hive README needs update
 

 Key: HIVE-6932
 URL: https://issues.apache.org/jira/browse/HIVE-6932
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6932.1.patch


 It needs to be updated to include Tez as a runtime. Also, it talks about 
 average latency being in minutes, which is very misleading.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6932) hive README needs update

2014-04-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6932:


Status: Patch Available  (was: Open)

 hive README needs update
 

 Key: HIVE-6932
 URL: https://issues.apache.org/jira/browse/HIVE-6932
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6932.1.patch


 It needs to be updated to include Tez as a runtime. Also, it talks about 
 average latency being in minutes, which is very misleading.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2

2014-04-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976271#comment-13976271
 ] 

Ashutosh Chauhan commented on HIVE-6939:


The description of the test says "test reduce with multiple tagged inputs", so I 
don't think it has any specific intention for # of reducers = 5. So # of reducers = 
1 sounds good to me. +1

 TestExecDriver.testMapRedPlan3 fails on hadoop-2
 

 Key: HIVE-6939
 URL: https://issues.apache.org/jira/browse/HIVE-6939
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6939.1.patch


 Passes on hadoop-1, but fails on hadoop-2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6946:
-

Status: Patch Available  (was: Open)

 Make it easier to run WebHCat e2e tests
 ---

 Key: HIVE-6946
 URL: https://issues.apache.org/jira/browse/HIVE-6946
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6946.patch


 Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to 
 set up WebHCat e2e tests but it's cumbersome and error prone.  Need to make 
 some improvements here.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6946:
-

Attachment: HIVE-6946.patch

 Make it easier to run WebHCat e2e tests
 ---

 Key: HIVE-6946
 URL: https://issues.apache.org/jira/browse/HIVE-6946
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6946.patch


 Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to 
 set up WebHCat e2e tests but it's cumbersome and error prone.  Need to make 
 some improvements here.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6469) skipTrash option in hive command line

2014-04-21 Thread Jayesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976297#comment-13976297
 ] 

Jayesh commented on HIVE-6469:
--

Xuefu,

This is really a minor convenience feature which definitely has a use case for our 
enterprise customers.
Are you suggesting providing this feature via a hive configuration that works in 
the following way?

set hive.warehouse.data.skipTrash = true    -- explicitly set
drop table large10TBTable                   -- this will skip trash
drop table anyOtherTable                    -- this will skip trash
set hive.warehouse.data.skipTrash = false   -- if you forget this, it will skipTrash forever, until corrected
drop table regularTable                     -- this will start placing data in trash

I believe that approach is not very intuitive and will lead to human error that 
creates a disaster if the necessary steps are not taken, which ultimately defeats 
hive's feature of providing trash as a backup.  

Also, different environments with different HS2 instances may not be the scenario 
here. This has proven to be very helpful in the same environment for different users.

Also, I don't think this pollutes the SQL syntax; think of it as the PURGE option in 
Oracle DB, and hence I totally see it being used by enterprise customers.
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9003.htm

Did you get a chance to look at the links I put earlier, where people seem to 
be searching for this little convenience feature?
Also, did you get a chance to talk to any customers who would like such a feature? 
Please let us know.

Thanks
Jayesh

 skipTrash option in hive command line
 -

 Key: HIVE-6469
 URL: https://issues.apache.org/jira/browse/HIVE-6469
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.12.0
Reporter: Jayesh
 Fix For: 0.12.1

 Attachments: HIVE-6469.patch


 The hive drop table command deletes the data from the HDFS warehouse and puts it 
 into the Trash.
 Currently there is no way to provide a flag to tell the warehouse to skip the 
 trash while deleting table data.
 This ticket is to add a skipTrash feature to the hive command line, which looks as 
 follows: 
 hive -e "drop table skipTrash testTable"
 This would be a good feature to add, so that users can specify when not to put 
 data into the trash directory and thus not fill HDFS space, instead of relying 
 on trash interval and policy configuration to take care of the disk-filling issue.
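 For illustration only (this is not how Hive implements DROP TABLE), a small sketch of 
 the two deletion behaviours such a flag would choose between, using the standard 
 Hadoop FileSystem and Trash APIs:
 {code:java}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.Trash;

 public class DropBehaviourSketch {
   // skipTrash == false: data moves to the user's .Trash directory (current behaviour).
   // skipTrash == true:  data is removed immediately, freeing HDFS space right away.
   static void dropTableDir(FileSystem fs, Path tableDir, Configuration conf,
                            boolean skipTrash) throws Exception {
     if (skipTrash) {
       fs.delete(tableDir, true);
     } else {
       Trash.moveToAppropriateTrash(fs, tableDir, conf);
     }
   }
 }
 {code}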



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Hive Contributor

2014-04-21 Thread Naveen Gangam
Dear Hive PMC,
I would like to contribute to the HIVE community. Could you please grant me
the contributor role?

My apache username is ngangam. Thank you in advance and I am looking
forward to becoming a part of the Hive community.

-- 

Thanks,
Naveen :)


[jira] [Created] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH

2014-04-21 Thread Peng Zhang (JIRA)
Peng Zhang created HIVE-6948:


 Summary: HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
 Key: HIVE-6948
 URL: https://issues.apache.org/jira/browse/HIVE-6948
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Peng Zhang
 Fix For: 0.13.0


HiveServer2 ignores HIVE_AUX_JARS_PATH.
This will cause aux jars not to be distributed to the YARN cluster, and jobs will 
fail without their dependent jars.
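As a rough illustration (not the actual HiveServer2 startup code), the kind of check 
involved: whatever launches the server has to read HIVE_AUX_JARS_PATH and fold it 
into hive.aux.jars.path so the jars travel with submitted jobs.
{code:java}
public class AuxJarsCheck {
  public static void main(String[] args) {
    // HIVE_AUX_JARS_PATH is the env var from this report; hive.aux.jars.path is the
    // Hive configuration property that job submission reads.
    String auxJars = System.getenv("HIVE_AUX_JARS_PATH");
    if (auxJars == null || auxJars.isEmpty()) {
      System.out.println("HIVE_AUX_JARS_PATH not set: dependent jars will not be shipped");
    } else {
      System.out.println("would set hive.aux.jars.path=" + auxJars);
    }
  }
}
{code}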



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH

2014-04-21 Thread Peng Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peng Zhang updated HIVE-6948:
-

Attachment: HIVE-6948.patch

 HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
 --

 Key: HIVE-6948
 URL: https://issues.apache.org/jira/browse/HIVE-6948
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Peng Zhang
 Fix For: 0.13.0

 Attachments: HIVE-6948.patch


 HiveServer2 ignores HIVE_AUX_JARS_PATH.
 This will cause aux jars not to be distributed to the YARN cluster, and jobs will 
 fail without their dependent jars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH

2014-04-21 Thread Peng Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peng Zhang updated HIVE-6948:
-

Status: Patch Available  (was: Open)

 HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
 --

 Key: HIVE-6948
 URL: https://issues.apache.org/jira/browse/HIVE-6948
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Peng Zhang
 Fix For: 0.13.0

 Attachments: HIVE-6948.patch


 HiveServer2 ignores HIVE_AUX_JARS_PATH.
 This will cause aux jars not to be distributed to the YARN cluster, and jobs will 
 fail without their dependent jars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6941) [WebHCat] Implement webhcat endpoint version/sqoop

2014-04-21 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6941:
-

Description: Since WebHCat supports invoking Sqoop jobs (introduced in 
HIVE-5072), it should also expose the endpoint version/sqoop to return the 
version of Sqoop used. In HIVE-5072, the endpoint version/sqoop is exposed but 
returns NOT_IMPLEMENTED_501.  (was: Since WebHCat support invoking Sqoop job 
(introduced in HIVE-5072), it should also expose endpoint version/sqoop to 
return the version of Sqoop used.)

 [WebHCat] Implement webhcat endpoint version/sqoop
 

 Key: HIVE-6941
 URL: https://issues.apache.org/jira/browse/HIVE-6941
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Shuaishuai Nie

 Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should 
 also expose the endpoint version/sqoop to return the version of Sqoop used. In 
 HIVE-5072, the endpoint version/sqoop is exposed but returns 
 NOT_IMPLEMENTED_501.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop

2014-04-21 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6941:
-

Summary: [WebHCat] Complete implementation of webhcat endpoint 
version/sqoop  (was: [WebHCat] Implement webhcat endpoint version/sqoop)

 [WebHCat] Complete implementation of webhcat endpoint version/sqoop
 -

 Key: HIVE-6941
 URL: https://issues.apache.org/jira/browse/HIVE-6941
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Shuaishuai Nie

 Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should 
 also expose the endpoint version/sqoop to return the version of Sqoop used. In 
 HIVE-5072, the endpoint version/sqoop is exposed but returns 
 NOT_IMPLEMENTED_501.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop

2014-04-21 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6941:
-

Description: Since WebHCat supports invoking Sqoop jobs (introduced in 
HIVE-5072), it should also expose the endpoint version/sqoop to return the 
version of Sqoop interactively. In HIVE-5072, the endpoint version/sqoop is 
exposed but returns NOT_IMPLEMENTED_501. The reason is that we cannot simply do 
the same as the endpoints version/hive or version/hadoop, since WebHCat does not 
have a dependency on Sqoop. Currently Sqoop 1 supports getting the version via the 
command "sqoop version". WebHCat can invoke this command using the 
templeton/v1/sqoop endpoint, but this is not interactive.  (was: Since WebHCat 
support invoking Sqoop job (introduced in HIVE-5072), it should also expose 
endpoint version/sqoop to return the version of Sqoop used. In HIVE-5072, the 
endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501.)

 [WebHCat] Complete implementation of webhcat endpoint version/sqoop
 -

 Key: HIVE-6941
 URL: https://issues.apache.org/jira/browse/HIVE-6941
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Shuaishuai Nie

 Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should 
 also expose the endpoint version/sqoop to return the version of Sqoop 
 interactively. In HIVE-5072, the endpoint version/sqoop is exposed but returns 
 NOT_IMPLEMENTED_501. The reason is that we cannot simply do the same as the 
 endpoints version/hive or version/hadoop, since WebHCat does not have a 
 dependency on Sqoop. Currently Sqoop 1 supports getting the version via the 
 command "sqoop version". WebHCat can invoke this command using the 
 templeton/v1/sqoop endpoint, but this is not interactive.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action

2014-04-21 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6940:
-

Description: 
WebHCat documentation needs to be updated based on the new feature introduced in 
HIVE-5072

Here are some examples using the endpoint templeton/v1/sqoop

example 1 (passing the Sqoop command directly):
curl -s -d command="import --connect 
jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password 
--table mytable --target-dir user/hadoop/importtable" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example 2 (passing a source file which contains the sqoop command):
curl -s -d optionsfile="/sqoopcommand/command0.txt" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example 3 (using --options-file in the middle of the sqoop command to enable reuse of 
part of the Sqoop command, like the connection string):
curl -s -d files="/sqoopcommand/command1.txt,/sqoopcommand/command2.txt" -d 
command="import --options-file command1.txt --options-file command2.txt" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Also, for users to pass their JDBC driver jar, they can use the -libjars 
generic option in the Sqoop command. This is functionality provided by Sqoop.

The set of parameters that can be passed to the endpoint:
command (Sqoop command string to run)
optionsfile (Options file containing the Sqoop command to run; each section 
of the Sqoop command separated by a space should be a single line in the options 
file)
files (Comma-separated files to be copied to the map reduce cluster)
statusdir (A directory where WebHCat will write the status of the Sqoop job. If 
provided, it is the caller’s responsibility to remove this directory when done)
callback (Define a URL to be called upon job completion. You may embed a specific 
job ID into the URL using $jobId. This tag will be replaced in the callback URL 
with the job’s job ID.)


  was:WebHCat documentation need to be updated based on the new feature 
introduced in HIVE-5072


 [WebHCat]Update documentation for Templeton-Sqoop action
 

 Key: HIVE-6940
 URL: https://issues.apache.org/jira/browse/HIVE-6940
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Affects Versions: 0.14.0
Reporter: Shuaishuai Nie

 WebHCat documentation needs to be updated based on the new feature introduced 
 in HIVE-5072
 Here are some examples using the endpoint templeton/v1/sqoop
 example 1 (passing the Sqoop command directly):
 curl -s -d command="import --connect 
 jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password 
 --table mytable --target-dir user/hadoop/importtable" -d 
 statusdir="sqoop.output" 
 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'
 example 2 (passing a source file which contains the sqoop command):
 curl -s -d optionsfile="/sqoopcommand/command0.txt" -d 
 statusdir="sqoop.output" 
 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'
 example 3 (using --options-file in the middle of the sqoop command to enable 
 reuse of part of the Sqoop command, like the connection string):
 curl -s -d files="/sqoopcommand/command1.txt,/sqoopcommand/command2.txt" -d 
 command="import --options-file command1.txt --options-file command2.txt" -d 
 statusdir="sqoop.output" 
 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'
 Also, for users to pass their JDBC driver jar, they can use the -libjars 
 generic option in the Sqoop command. This is functionality provided by 
 Sqoop.
 The set of parameters that can be passed to the endpoint:
 command (Sqoop command string to run)
 optionsfile (Options file containing the Sqoop command to run; each 
 section of the Sqoop command separated by a space should be a single line in 
 the options file)
 files (Comma-separated files to be copied to the map reduce cluster)
 statusdir (A directory where WebHCat will write the status of the Sqoop job. 
 If provided, it is the caller’s responsibility to remove this directory when 
 done)
 callback (Define a URL to be called upon job completion. You may embed a 
 specific job ID into the URL using $jobId. This tag will be replaced in the 
 callback URL with the job’s job ID.)
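 As a cross-check of example 1 above, a plain-Java sketch of the same call using only 
 java.net.HttpURLConnection; the host, port, credentials, and user.name value are the 
 same placeholders as in the curl examples:
 {code:java}
 import java.io.OutputStream;
 import java.net.HttpURLConnection;
 import java.net.URL;
 import java.net.URLEncoder;

 public class SqoopEndpointSketch {
   public static void main(String[] args) throws Exception {
     // Same placeholder command and statusdir as curl example 1 above.
     String body = "command=" + URLEncoder.encode(
         "import --connect jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;"
         + "user=hadoop;password=password --table mytable --target-dir user/hadoop/importtable",
         "UTF-8")
         + "&statusdir=" + URLEncoder.encode("sqoop.output", "UTF-8");
     URL url = new URL("http://localhost:50111/templeton/v1/sqoop?user.name=hadoop");
     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
     conn.setRequestMethod("POST");
     conn.setDoOutput(true);
     conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
     try (OutputStream out = conn.getOutputStream()) {
       out.write(body.getBytes("UTF-8"));
     }
     // WebHCat responds with a JSON body containing the id of the launched job.
     System.out.println("HTTP " + conn.getResponseCode());
   }
 }
 {code}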



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action

2014-04-21 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6940:
-

Description: 
WebHCat documentation needs to be updated based on the new feature introduced in 
HIVE-5072

Here are some examples using the endpoint templeton/v1/sqoop

example 1 (passing the Sqoop command directly):
curl -s -d command="import --connect 
jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password 
--table mytable --target-dir user/hadoop/importtable" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example 2 (passing a source file which contains the sqoop command):
curl -s -d optionsfile="/sqoopcommand/command0.txt" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example 3 (using --options-file in the middle of the sqoop command to enable reuse of 
part of the Sqoop command, like the connection string):
curl -s -d files="/sqoopcommand/command1.txt,/sqoopcommand/command2.txt" -d 
command="import --options-file command1.txt --options-file command2.txt" -d 
statusdir="sqoop.output" 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Also, for users to pass their JDBC driver jar, they can use the -libjars 
generic option in the Sqoop command. This is functionality provided by Sqoop.

The set of parameters that can be passed to the endpoint:
command 
(Sqoop command string to run)
optionsfile
(Options file containing the Sqoop command to run; each section of the 
Sqoop command separated by a space should be a single line in the options file)
files 
(Comma-separated files to be copied to the map reduce cluster)
statusdir 
(A directory where WebHCat will write the status of the Sqoop job. If provided, 
it is the caller’s responsibility to remove this directory when done)
callback 
(Define a URL to be called upon job completion. You may embed a specific job ID 
into the URL using $jobId. This tag will be replaced in the callback URL with 
the job’s job ID.)
enablelog
(When set to true, WebHCat will upload the job log to statusdir. statusdir must 
be defined when this is enabled.)

All the above parameters are optional, but users have to provide either command 
or optionsfile in the request.


  was:
WebHCat documentation need to be updated based on the new feature introduced in 
HIVE-5072

Here is some examples using the endpoint templeton/v1/sqoop

example1: (passing Sqoop command directly)
curl -s -d command=import --connect 
jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password
 --table mytable --target-dir user/hadoop/importtable -d 
statusdir=sqoop.output 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example2: (passing source file which contains sqoop command)
curl -s -d optionsfile=/sqoopcommand/command0.txt  -d 
statusdir=sqoop.output 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

example3: (using --options-file in the middle of sqoop command to enable reuse 
part of Sqoop command like connection string)
curl -s -d files=/sqoopcommand/command1.txt,/sqoopcommand/command2.txt -d 
command=import --options-file command1.txt --options-file command2.txt -d 
statusdir=sqoop.output 
'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Also, for user to pass their JDBC driver jar, they can use the -libjars 
generic option in the Sqoop command. This is a functionality provided by Sqoop.

Set of parameters can be passed to the endpoint:
command (Sqoop command string to run)
optionsfile (Options file which contain Sqoop command need to run, each section 
in the Sqoop command separated by space should be a single line in the options 
file)
files (Comma seperated files to be copied to the map reduce cluster)
statusdir (A directory where WebHCat will write the status of the Sqoop job. If 
provided, it is the caller’s responsibility to remove this directory when done)
callback (Define a URL to be called upon job 
completion. You may 
embed a specific job 
ID into the URL using 
$jobId. This tag will 
be replaced in the 
callback URL with the 
job’s job ID. 
)



 [WebHCat]Update documentation for Templeton-Sqoop action
 

 Key: HIVE-6940
 URL: https://issues.apache.org/jira/browse/HIVE-6940
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Affects Versions: 0.14.0
Reporter: Shuaishuai Nie

 WebHCat documentation need to be updated based on the new feature introduced 
 in HIVE-5072
 Here is some examples using the endpoint templeton/v1/sqoop
 example1: (passing Sqoop command directly)
 curl -s -d command=import --connect 
 jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password
  --table mytable --target-dir user/hadoop/importtable -d 
 statusdir=sqoop.output 
 

[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976362#comment-13976362
 ] 

Hive QA commented on HIVE-6944:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12641125/HIVE-6944.patch

{color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 5417 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitions
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testNameMethods
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartition
org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapRedPlan3
org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat.org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/1/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/1/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 48 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12641125

 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects 

[jira] [Commented] (HIVE-6469) skipTrash option in hive command line

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976374#comment-13976374
 ] 

Xuefu Zhang commented on HIVE-6469:
---

{quote}
set hive.warehouse.data.skipTrash = true -- explicitly set
drop table large10TBTable -- this will skip trash
drop table anyOtherTable -- this will skip trash
set hive.warehouse.data.skipTrash = false -- if you forget this, it will 
skipTrash forever, until corrected.
drop table regularTable -- this will start placing data in trash
{quote}
Actually, I mean hive.warehouse.data.skipTrash to be an admin property that a 
normal user will be able to set. Thus, the server will have this either on or 
off. I expect a prod server will have this on while a dev server will have this 
off. Isn't this good enough? Setting this on/off based on prod/dev seems more 
reasonable than basing it on table size. If you are in a dev environment, you just 
disable the feature, and why do you care whether the table is big or small? In a 
prod environment, on the other hand, every table is important, so the feature 
should always be on. Anything else I'm missing here?


 skipTrash option in hive command line
 -

 Key: HIVE-6469
 URL: https://issues.apache.org/jira/browse/HIVE-6469
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.12.0
Reporter: Jayesh
 Fix For: 0.12.1

 Attachments: HIVE-6469.patch


 The hive drop table command deletes the table data from the HDFS warehouse and 
 puts it into the Trash.
 Currently there is no way to provide a flag telling the warehouse to skip the 
 trash when deleting table data.
 This ticket is to add a skipTrash feature to the hive command line, which would 
 look as follows:
 hive -e "drop table skipTrash testTable"
 This would be a good feature to add, so that users can specify when not to put 
 data into the trash directory, and thus avoid filling HDFS space, instead of 
 relying on the trash interval and policy configuration to deal with the disk 
 filling up.
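
 For context, a sketch of today's behavior versus the proposed flag; the table name 
 is illustrative, and the skipTrash syntax is the one proposed in this ticket, not 
 an existing option:

 # Today: dropped table data is moved to the HDFS trash when trash is enabled
 hive -e "drop table testTable"

 # Proposed: bypass the trash for this one drop so the HDFS space is freed immediately
 hive -e "drop table skipTrash testTable"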



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6469) skipTrash option in hive command line

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976376#comment-13976376
 ] 

Xuefu Zhang commented on HIVE-6469:
---

{quote}
 that normal user will be able to set.
{quote}
I meant to say that normal user will NOT be able to set.

 skipTrash option in hive command line
 -

 Key: HIVE-6469
 URL: https://issues.apache.org/jira/browse/HIVE-6469
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.12.0
Reporter: Jayesh
 Fix For: 0.12.1

 Attachments: HIVE-6469.patch


 The hive drop table command deletes the table data from the HDFS warehouse and 
 puts it into the Trash.
 Currently there is no way to provide a flag telling the warehouse to skip the 
 trash when deleting table data.
 This ticket is to add a skipTrash feature to the hive command line, which would 
 look as follows:
 hive -e "drop table skipTrash testTable"
 This would be a good feature to add, so that users can specify when not to put 
 data into the trash directory, and thus avoid filling HDFS space, instead of 
 relying on the trash interval and policy configuration to deal with the disk 
 filling up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Hive Contributor

2014-04-21 Thread Ashutosh Chauhan
Welcome aboard, Naveen!
I have added you as contributor to project. Looking forward to your
contributions to Hive.

Ashutosh


On Mon, Apr 21, 2014 at 7:18 PM, Naveen Gangam ngan...@cloudera.com wrote:

 Dear Hive PMC,
 I would like to contribute to the HIVE community. Could you please grant me
 the contributor role?

 My apache username is ngangam. Thank you in advance and I am looking
 forward to becoming a part of the Hive community.

 --

 Thanks,
 Naveen :)



[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6944:
-

Description: 
HIVE-6432 removed templeton/v/queue REST endpoint and broke webhcat e2e tests

NO PRECOMMIT TESTS

  was:
HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests

NO PRECOMMIT TESTS


 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6944.patch


 HIVE-6432 removed templeton/v/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432

2014-04-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976384#comment-13976384
 ] 

Eugene Koifman commented on HIVE-6944:
--

In spite of NO PRECOMMIT TESTS, it still ran the tests.
In any case, this is a WebHCat-only change, so these test failures are not related.

 WebHCat e2e tests broken by HIVE-6432
 -

 Key: HIVE-6944
 URL: https://issues.apache.org/jira/browse/HIVE-6944
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6944.patch


 HIVE-6432 removed templeton/v/queue REST endpoint and broke webhcat e2e tests
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop

2014-04-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6941:
-

Affects Version/s: 0.14.0

 [WebHCat] Complete implementation of webhcat endpoint version/sqoop
 -

 Key: HIVE-6941
 URL: https://issues.apache.org/jira/browse/HIVE-6941
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Shuaishuai Nie

 Since WebHCat supports invoking a Sqoop job (introduced in HIVE-5072), it should 
 also expose the endpoint version/sqoop to return the version of Sqoop 
 interactively. In HIVE-5072, the endpoint version/sqoop is exposed but returns 
 NOT_IMPLEMENTED_501. The reason is that we cannot simply do the same as the 
 endpoint version/hive or version/hadoop, since WebHCat does not have a 
 dependency on Sqoop. Currently Sqoop 1 supports getting the version using the 
 command "sqoop version". WebHCat can invoke this command using the 
 templeton/v1/sqoop endpoint, but this is not interactive.
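
 For illustration, a sketch of that non-interactive workaround (host and user.name 
 are assumptions); the version string lands in statusdir rather than in the HTTP 
 response:

 # Run "sqoop version" as a batch job through the existing sqoop endpoint
 curl -s -d command="version" -d statusdir="sqoop.version.out" \
      'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'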



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton

2014-04-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976390#comment-13976390
 ] 

Eugene Koifman commented on HIVE-5072:
--

+1 (non binding)

 [WebHCat]Enable directly invoke Sqoop job through Templeton
 ---

 Key: HIVE-5072
 URL: https://issues.apache.org/jira/browse/HIVE-5072
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, 
 HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf


 Now it is hard to invoke a Sqoop job through Templeton. The only way is to use 
 the classpath jar generated by a Sqoop job and use the jar delegator in 
 Templeton. We should implement a Sqoop delegator to enable directly invoking a 
 Sqoop job through Templeton.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6924) MapJoinKeyBytes::hashCode() should use Murmur hash

2014-04-21 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976406#comment-13976406
 ] 

Remus Rusanu commented on HIVE-6924:


[~jnp] I created HIVE-6949 for the vectorized hash

 MapJoinKeyBytes::hashCode() should use Murmur hash
 --

 Key: HIVE-6924
 URL: https://issues.apache.org/jira/browse/HIVE-6924
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6924.patch


 The existing hashCode is bad; it causes HashMap entries to cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6949) VectorHashKeyWrapper hashCode() should use Murmur hash

2014-04-21 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-6949:
--

 Summary: VectorHashKeyWrapper hashCode() should use Murmur hash
 Key: HIVE-6949
 URL: https://issues.apache.org/jira/browse/HIVE-6949
 Project: Hive
  Issue Type: Improvement
Reporter: Remus Rusanu
Assignee: Remus Rusanu


HIVE-6924 replaced the hash of MapJoinKeyBytes with the MurmurHash algorithm. 
The vectorized hash should do the same.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive

2014-04-21 Thread Ted Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976410#comment-13976410
 ] 

Ted Xu commented on HIVE-5771:
--

Hi [~ashutoshc], thanks for the patch, I will look into this.

 Constant propagation optimizer for Hive
 ---

 Key: HIVE-5771
 URL: https://issues.apache.org/jira/browse/HIVE-5771
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ted Xu
Assignee: Ted Xu
 Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, 
 HIVE-5771.4.patch, HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.7.patch, 
 HIVE-5771.8.patch, HIVE-5771.patch


 Currently there is no constant folding/propagation optimizer; all expressions 
 are evaluated at runtime. 
 HIVE-2470 did a great job of evaluating constants in the UDF initialization 
 phase; however, that is still a runtime evaluation, and it doesn't propagate 
 constants from a subquery outward.
 Introducing such an optimizer may reduce I/O and accelerate processing.
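
 As an illustrative sketch of the kind of query this targets (assuming the standard 
 src test table), the constant defined in the subquery could be folded into the 
 outer filter at compile time instead of being evaluated per row:

 # Compare the plan with and without a constant folding/propagation pass
 hive -e "explain select s.key from (select key, 1 as one from src) s where s.one = 1"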



--
This message was sent by Atlassian JIRA
(v6.2#6252)