[jira] [Commented] (HIVE-13553) CTE with upperCase alias throws exception

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249306#comment-15249306
 ] 

Ashutosh Chauhan commented on HIVE-13553:
-

+1 pending tests

> CTE with upperCase alias throws exception
> -
>
> Key: HIVE-13553
> URL: https://issues.apache.org/jira/browse/HIVE-13553
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13553.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13553) CTE with upperCase alias throws exception

2016-04-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13553:

Status: Patch Available  (was: Open)

> CTE with upperCase alias throws exception
> -
>
> Key: HIVE-13553
> URL: https://issues.apache.org/jira/browse/HIVE-13553
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13553.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249269#comment-15249269
 ] 

Amareshwari Sriramadasu commented on HIVE-13421:


+1
https://issues.apache.org/jira/secure/attachment/12799456/HIVE-13421.04.patch 
looks good.

> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch, HIVE-13421.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13480) Add hadoop2 metrics reporter for Codahale metrics

2016-04-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249113#comment-15249113
 ] 

Lefty Leverenz commented on HIVE-13480:
---

Iff you make another patch, please spell out HMS and HS2 in the description of 
*hive.service.metrics.hadoop2.component* (even though they're clear enough from 
context and spelling them out may seem redundant):

{quote}
+HIVE_METRICS_HADOOP2_COMPONENT_NAME("hive.service.metrics.hadoop2.component",
+"hive",
+"Component name to provide to Hadoop2 Metrics system. Ideally 'hivemetastore' for HMS and 'hiveserver2' for HS2."
+),
{quote}
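
For concreteness, the spelled-out wording might read something like this (a sketch of the suggested description text only, mirroring the HiveConf entry quoted above):

{code}
HIVE_METRICS_HADOOP2_COMPONENT_NAME("hive.service.metrics.hadoop2.component",
    "hive",
    "Component name to provide to Hadoop2 Metrics system. Ideally "
        + "'hivemetastore' for the Hive Metastore (HMS) and 'hiveserver2' "
        + "for HiveServer2 (HS2)."),
{code}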

> Add hadoop2 metrics reporter for Codahale metrics
> -
>
> Key: HIVE-13480
> URL: https://issues.apache.org/jira/browse/HIVE-13480
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13480.2.patch, HIVE-13480.3.patch, HIVE-13480.patch
>
>
> Multiple other Apache components allow sending metrics over to Hadoop2 
> metrics, which allows monitoring solutions like Ambari Metrics Server to 
> work against that to show metrics for components in one place. Our Codahale 
> metrics works very well, so ideally, we would like to bridge the two, to 
> allow Codahale to add a Hadoop2 reporter that enables us to continue to use 
> Codahale metrics (i.e. not write another custom metrics impl) but report 
> using Hadoop2.
> Apache Phoenix also had such a use case recently and was in the process of 
> adding in a stub piece that allows this forwarding. We should use the same 
> reporter to minimize redundancy while pushing metrics to a centralized 
> solution like Hadoop2 Metrics/AMS.
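
To make the bridge concrete, here is a rough sketch using the reporter that grew out of the Phoenix work referenced above (the dropwizard-metrics-hadoop-metrics2-reporter project); the package name and builder arguments follow that project's documentation and should be read as assumptions, not Hive's final wiring:

{code}
import java.util.concurrent.TimeUnit;
import com.codahale.metrics.MetricRegistry;
import com.github.joshelser.dropwizard.metrics.hadoop.HadoopMetrics2Reporter;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;

public class CodahaleToHadoop2Sketch {
  public static void main(String[] args) throws Exception {
    MetricRegistry metrics = new MetricRegistry(); // the existing Codahale registry
    // Attach a reporter that periodically republishes every Codahale metric
    // into the Hadoop2 metrics system, where AMS and similar tooling can see it.
    HadoopMetrics2Reporter reporter = HadoopMetrics2Reporter.forRegistry(metrics)
        .build(DefaultMetricsSystem.initialize("HiveServer2"), // application name
            "hiveserver2",         // component name (cf. hive.service.metrics.hadoop2.component)
            "HiveServer2 metrics", // component description
            "General");            // record name for each snapshot
    reporter.start(30, TimeUnit.SECONDS);

    metrics.counter("open_connections").inc(); // picked up on the next report cycle
    Thread.sleep(60_000); // keep the JVM alive long enough to see a report
  }
}
{code}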



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13480) Add hadoop2 metrics reporter for Codahale metrics

2016-04-19 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249096#comment-15249096
 ] 

Thejas M Nair commented on HIVE-13480:
--

+1
I think we should look into setting the component automatically from 
HS2/metastore startup code instead of setting it via a config.
But that can be a follow-up task.


> Add hadoop2 metrics reporter for Codahale metrics
> -
>
> Key: HIVE-13480
> URL: https://issues.apache.org/jira/browse/HIVE-13480
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13480.2.patch, HIVE-13480.3.patch, HIVE-13480.patch
>
>
> Multiple other Apache components allow sending metrics over to Hadoop2 
> metrics, which allows monitoring solutions like Ambari Metrics Server to 
> work against that to show metrics for components in one place. Our Codahale 
> metrics works very well, so ideally, we would like to bridge the two, to 
> allow Codahale to add a Hadoop2 reporter that enables us to continue to use 
> Codahale metrics (i.e. not write another custom metrics impl) but report 
> using Hadoop2.
> Apache Phoenix also had such a use case recently and was in the process of 
> adding in a stub piece that allows this forwarding. We should use the same 
> reporter to minimize redundancy while pushing metrics to a centralized 
> solution like Hadoop2 Metrics/AMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on InterruptedException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Summary: Templeton job does not write out log files on InterruptedException 
 (was: Templeton job does not write out log files on IOException)

> Templeton job does not write out log files on InterruptedException
> --
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249076#comment-15249076
 ] 

Lefty Leverenz commented on HIVE-12049:
---

+1 for typo fixes and descriptions of configuration parameters

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.
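
As a rough illustration of the knob this proposal builds on, a JDBC client session might look like the sketch below (assumes a reachable HiveServer2 and a table named src; {{hive.query.result.fileformat}} and the SequenceFile value come from the description, everything else is illustrative):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThriftResultFormatSketch {
  public static void main(String[] args) throws Exception {
    try (Connection con =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = con.createStatement()) {
      // The pluggable property from the description: results are written as
      // SequenceFiles, whose value blobs would carry thrift-encoded row batches.
      stmt.execute("SET hive.query.result.fileformat=SequenceFile");
      try (ResultSet rs = stmt.executeQuery("SELECT * FROM src LIMIT 10")) {
        while (rs.next()) {
          // Under the proposal, the driver decodes the pre-serialized batch
          // instead of HS2 re-translating each row on every fetch request.
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}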



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13543) HiveServer2: Add unit tests for testing configurable LDAP user key name, group membership key name and group class name

2016-04-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13543:

Assignee: (was: Vaibhav Gumashta)

> HiveServer2: Add unit tests for testing configurable LDAP user key name, 
> group membership key name and group class name  
> -
>
> Key: HIVE-13543
> URL: https://issues.apache.org/jira/browse/HIVE-13543
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vaibhav Gumashta
>
> HIVE-13295 made the above-mentioned properties configurable. We should add 
> unit tests for these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13553) CTE with upperCase alias throws exception

2016-04-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13553:
---
Attachment: HIVE-13553.01.patch

[~ashutoshc], could u take a look? Thanks.
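
For readers without the patch handy, the title suggests a repro along the following lines (hypothetical; the qfile tests in HIVE-13553.01.patch define the actual failing cases): a CTE declared with an upper-case alias and referenced in a different case.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CteAliasSketch {
  public static void main(String[] args) throws Exception {
    try (Connection con =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = con.createStatement();
         // Hypothetical repro: the CTE alias is declared as Q1 but referenced
         // as q1, which per the title throws instead of resolving.
         ResultSet rs = stmt.executeQuery(
             "WITH Q1 AS (SELECT key FROM src) SELECT * FROM q1")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}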

> CTE with upperCase alias throws exception
> -
>
> Key: HIVE-13553
> URL: https://issues.apache.org/jira/browse/HIVE-13553
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13553.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13395) Lost Update problem in ACID

2016-04-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249040#comment-15249040
 ] 

Eugene Koifman commented on HIVE-13395:
---

This implements the idea that if 2 concurrent transactions have a Write/Write 
conflict, one must be aborted, and the First-Committer-Wins rule is used to 
decide which.

It's implemented using Write-Set tracking so that it can be extended to 
multi-statement txns in the future.  The check on lock acquisition is an 
optimization, while the commitTxn() logic is the ultimate authority.  (It also 
makes the logic cleaner, in particular around properly mutexing operations, 
which have to be done using the RDBMS.  I started with your idea to check the 
latest writer's txn id vs the currently committing one, but it ended up more 
difficult to do in a DB.)

TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS have different retention policies.  
These are governed by compaction rather than transaction "liveness", and the 
two don't necessarily match.
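
To make the commit-time check concrete, below is a minimal sketch of the kind of query commitTxn() can issue against a write-set table. The table and column names (WRITE_SET, ws_txnid, ws_commit_id, ws_database, ws_table, ws_partition) are illustrative assumptions, not the patch's actual metastore DDL:

{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class WriteSetConflictSketch {
  /**
   * First-committer-wins, illustratively: if any transaction that overlapped
   * ours (i.e. committed after we took our snapshot) wrote the same element,
   * we have a Write/Write conflict and the caller must abort this txn.
   */
  static boolean hasConflict(Connection metastoreDb, long myStartTxnId,
      String db, String table, String partition) throws SQLException {
    String sql = "SELECT ws_txnid FROM WRITE_SET"
        + " WHERE ws_commit_id > ?"   // committed after our snapshot was taken
        + " AND ws_database = ? AND ws_table = ? AND ws_partition = ?";
    try (PreparedStatement ps = metastoreDb.prepareStatement(sql)) {
      ps.setLong(1, myStartTxnId);
      ps.setString(2, db);
      ps.setString(3, table);
      ps.setString(4, partition);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next(); // any row means an overlapping committed writer
      }
    }
  }
}
{code}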


> Lost Update problem in ACID
> ---
>
> Key: HIVE-13395
> URL: https://issues.apache.org/jira/browse/HIVE-13395
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-13395.6.patch, HIVE-13395.7.patch
>
>
> ACID users can run into Lost Update problem.
> In Hive 1.2, Driver.recordValidTxns() (which records the snapshot to use for 
> the query) is called in Driver.compile().
> Now suppose two concurrent "update T set x = x + 1" are executed.  (for 
> simplicity assume there is exactly 1 row in T)
> What can happen is that both compile at the same time (more precisely before 
> acquireLocksAndOpenTxn() in runInternal() is called) and thus will lock in 
> the same snapshot, say the value of x = 7 in this snapshot.
> Now 1 will get the lock on the row, the second will block.  
> Now 1, makes x = 8 and commits.
> Now 2 proceeds and makes x = 8 again since in its snapshot x is still 7.
> This specific issue is solved in Hive 1.3/2.0 (HIVE-11077 which is a large 
> patch that deals with multi-statement txns) by moving recordValidTxns() after 
> locks are acquired which reduces the likelihood of this but doesn't eliminate 
> the problem.
> 
> Even in 1.3 version of the code, you could have the same issue.  Assume the 
> same 2 queries:
> Both start a txn, say txnid 9 and 10.  Say 10 gets the lock first, 9 blocks.
> 10 updates the row (so x = 8) and thus ReaderKey.currentTransactionId=10.
> 10 commits.
> Now 9 can proceed and it will get a snapshot that includes 10, i.e. it will 
> see x = 8 and it will write x = 9, but it will set 
> ReaderKey.currentTransactionId = 9.  Thus when merge logic runs, it will see 
> x = 8 is the later version of this row, i.e. lost update.
> The problem is that locks alone are insufficient for MVCC architecture.  
> 
> At lower level Row ID has (originalTransactionId, rowid, bucket id, 
> currentTransactionId) and since on update/delete we do a table scan, we could 
> check that we are about to write a row with currentTransactionId < 
> (currentTransactionId of row we've read) and fail the query.  Currently, 
> currentTransactionId is not surfaced at higher level where this check can be 
> made.
> This would not work (efficiently) longer term where we want to support fast 
> update on user-defined PK via streaming ingest.
> Also, this would not work with multi statement txns since in that case we'd 
> lock in the snapshot at the start of the txn, but then 2nd, 3rd etc queries 
> would use the same snapshot and the locks for these queries would be acquired 
> after the snapshot is locked in so this would be the same situation as pre 
> HIVE-11077.
> 
>  
> A more robust solution (commonly used with MVCC) is to keep track of start 
> and commit time (logical counter) of each transaction to detect if two txns 
> overlap.  The 2nd part is to keep track of write-set, i.e. which data (rows, 
> partitions, whatever appropriate level of granularity is) were modified by 
> any txn and if 2 txns overlap in time and wrote the same element, abort later 
> one.  This is called the first-committer-wins rule.  This requires a MS DB schema 
> change.
> It would be most convenient to use the same sequence for txnId, start and 
> commit time (in which case txnid=start time).  In this case we'd need to add 
> 1 field to the TXNS table.  The complication here is that we'll be using elements 
> of the sequence faster and they are used as part of file name of delta and 
> base dir and currently limited to 7 digits which can be exceeded.  So this 
> would require some thought to handling upgrade/migration.
> Also, write-set tracking requires either additional metastore table or 
> keeping info in HIVE_LOCKS around longer with new state.

[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13290:
-
Attachment: HIVE-13290.6.patch

cc-ing [~ashutoshc] for review. Will add unit tests soon.

> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch, HIVE-13290.4.patch, HIVE-13290.5.patch, HIVE-13290.6.patch
>
>
> SUPPORT for the following statements
> {code}
> CREATE TABLE product 
>   ( 
>  product_id INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id)  DISABLE NOVALIDATE
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE RELY
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
> specified by the user, we will use a system-generated constraint name. For the 
> purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
> not for the primary key, since there is only one primary key per table. The 
> RELY/NORELY keyword is also optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13290:
-
Description: 
SUPPORT for the following statements
{code}
CREATE TABLE product 
  ( 
 product_id INTEGER, 
 product_vendor_id INTEGER, 
 PRIMARY KEY (product_id), 
 CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
vendor(vendor_id)  DISABLE NOVALIDATE
  ); 

CREATE TABLE vendor 
  ( 
 vendor_id INTEGER, 
 PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE RELY
  ); 
{code}

In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
specified by the user, we will use a system-generated constraint name. For the 
purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
not for the primary key, since there is only one primary key per table. The 
RELY/NORELY keyword is also optional.


  was:
SUPPORT for the following statements
{code}
CREATE TABLE product 
  ( 
 product_id INTEGER, 
 product_vendor_id INTEGER, 
 PRIMARY KEY (product_id), 
 CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
vendor(vendor_id)  DISABLE NOVALIDATE
  ); 

CREATE TABLE vendor 
  ( 
 vendor_id INTEGER, 
 PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE
  ); 
{code}

In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
specified by the user, we will use a system-generated constraint name. For the 
purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
not for the primary key, since there is only one primary key per table.


> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch, HIVE-13290.4.patch, HIVE-13290.5.patch
>
>
> SUPPORT for the following statements
> {code}
> CREATE TABLE product 
>   ( 
>  product_id INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id)  DISABLE NOVALIDATE
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE RELY
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
> specified by the user, we will use a system-generated constraint name. For the 
> purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
> not for the primary key, since there is only one primary key per table. The 
> RELY/NORELY keyword is also optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13290:
-
Description: 
SUPPORT for the following statements
{code}
CREATE TABLE product 
  ( 
 product_id INTEGER, 
 product_vendor_id INTEGER, 
 PRIMARY KEY (product_id), 
 CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
vendor(vendor_id)  DISABLE NOVALIDATE
  ); 

CREATE TABLE vendor 
  ( 
 vendor_id INTEGER, 
 PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE
  ); 
{code}

In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
specified by the user, we will use a system-generated constraint name. For the 
purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
not for the primary key, since there is only one primary key per table.

  was:
SUPPORT for the following statements
{code}
CREATE TABLE product 
  ( 
 product_id INTEGER, 
 product_vendor_id INTEGER, 
 PRIMARY KEY (product_id), 
 CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
vendor(vendor_id) 
  ); 

CREATE TABLE vendor 
  ( 
 vendor_id INTEGER, 
 PRIMARY KEY (vendor_id) 
  ); 
{code}

In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
specified by the user, we will use a system-generated constraint name. For the 
purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
not for the primary key, since there is only one primary key per table.


> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch, HIVE-13290.4.patch, HIVE-13290.5.patch
>
>
> SUPPORT for the following statements
> {code}
> CREATE TABLE product 
>   ( 
>  product_id INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id)  DISABLE NOVALIDATE
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id)  DISABLE NOVALIDATE
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not 
> specified by the user, we will use a system-generated constraint name. For the 
> purpose of simplicity, we will allow the CONSTRAINT option for foreign keys but 
> not for the primary key, since there is only one primary key per table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13541) Pass view's ColumnAccessInfo to HiveAuthorizer

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248907#comment-15248907
 ] 

Ashutosh Chauhan commented on HIVE-13541:
-

+1 pending tests

> Pass view's ColumnAccessInfo to HiveAuthorizer
> --
>
> Key: HIVE-13541
> URL: https://issues.apache.org/jira/browse/HIVE-13541
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13541.01.patch
>
>
> Right now, only the table's ColumnAccessInfo is passed to HiveAuthorizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13395) Lost Update problem in ACID

2016-04-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248872#comment-15248872
 ] 

Alan Gates commented on HIVE-13395:
---

This is not a complete review, but I have a few high-level questions/comments.  
Answering these will help me finish the rest of the review.

The description presents several possible courses of action.  Some comments on 
which one you chose and why would be good.

Why add a new WRITE_SET table?  Doesn't the TXN_COMPONENTS table have 
everything you need, since it tracks partitions/tables that were written to?

Why the double check for a write conflict in commitTxn?  You're already 
checking the conflict when you acquire the shared-write lock.  Since the writer 
holds the lock, no one could possibly update that partition.  If we want to 
someday shift to letting shared-write locks proceed together and do true 
first-committer-wins, then checking in commitTxn makes sense.  Is that your plan?

> Lost Update problem in ACID
> ---
>
> Key: HIVE-13395
> URL: https://issues.apache.org/jira/browse/HIVE-13395
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-13395.6.patch, HIVE-13395.7.patch
>
>
> ACID users can run into Lost Update problem.
> In Hive 1.2, Driver.recordValidTxns() (which records the snapshot to use for 
> the query) is called in Driver.compile().
> Now suppose two concurrent "update T set x = x + 1" are executed.  (for 
> simplicity assume there is exactly 1 row in T)
> What can happen is that both compile at the same time (more precisely before 
> acquireLocksAndOpenTxn() in runInternal() is called) and thus will lock in 
> the same snapshot, say the value of x = 7 in this snapshot.
> Now 1 will get the lock on the row, the second will block.  
> Now 1, makes x = 8 and commits.
> Now 2 proceeds and makes x = 8 again since in its snapshot x is still 7.
> This specific issue is solved in Hive 1.3/2.0 (HIVE-11077 which is a large 
> patch that deals with multi-statement txns) by moving recordValidTxns() after 
> locks are acquired which reduces the likelihood of this but doesn't eliminate 
> the problem.
> 
> Even in 1.3 version of the code, you could have the same issue.  Assume the 
> same 2 queries:
> Both start a txn, say txnid 9 and 10.  Say 10 gets the lock first, 9 blocks.
> 10 updates the row (so x = 8) and thus ReaderKey.currentTransactionId=10.
> 10 commits.
> Now 9 can proceed and it will get a snapshot that includes 10, i.e. it will 
> see x = 8 and it will write x = 9, but it will set 
> ReaderKey.currentTransactionId = 9.  Thus when merge logic runs, it will see 
> x = 8 is the later version of this row, i.e. lost update.
> The problem is that locks alone are insufficient for MVCC architecture.  
> 
> At lower level Row ID has (originalTransactionId, rowid, bucket id, 
> currentTransactionId) and since on update/delete we do a table scan, we could 
> check that we are about to write a row with currentTransactionId < 
> (currentTransactionId of row we've read) and fail the query.  Currently, 
> currentTransactionId is not surfaced at higher level where this check can be 
> made.
> This would not work (efficiently) longer term where we want to support fast 
> update on user-defined PK via streaming ingest.
> Also, this would not work with multi statement txns since in that case we'd 
> lock in the snapshot at the start of the txn, but then 2nd, 3rd etc queries 
> would use the same snapshot and the locks for these queries would be acquired 
> after the snapshot is locked in so this would be the same situation as pre 
> HIVE-11077.
> 
>  
> A more robust solution (commonly used with MVCC) is to keep track of start 
> and commit time (logical counter) of each transaction to detect if two txns 
> overlap.  The 2nd part is to keep track of write-set, i.e. which data (rows, 
> partitions, whatever appropriate level of granularity is) were modified by 
> any txn and if 2 txns overlap in time and wrote the same element, abort later 
> one.  This is called the first-committer-wins rule.  This requires a MS DB schema 
> change.
> It would be most convenient to use the same sequence for txnId, start and 
> commit time (in which case txnid=start time).  In this case we'd need to add 
> 1 field to the TXNS table.  The complication here is that we'll be using elements 
> of the sequence faster and they are used as part of file name of delta and 
> base dir and currently limited to 7 digits which can be exceeded.  So this 
> would require some thought to handling upgrade/migration.
> Also, write-set tracking requires either additional metastore table or 
> keeping info in HIVE_LOCKS around longer with new state.
> 

[jira] [Commented] (HIVE-10293) enabling travis-ci build?

2016-04-19 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248863#comment-15248863
 ] 

Gabor Liptak commented on HIVE-10293:
-

Thank you. So if the Maven version is the culprit, it works under 3.0.5 and 
3.3.3 (my local Linux) and fails under 3.2.5 (on travis-ci) ... Am I to try 
running against a current Maven?

> enabling travis-ci build?
> -
>
> Key: HIVE-10293
> URL: https://issues.apache.org/jira/browse/HIVE-10293
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Gabor Liptak
>Assignee: Gabor Liptak
>Priority: Minor
> Attachments: HIVE-10293.1.patch, HIVE-10293.2.diff
>
>
> I would like to contribute a .travis.yml for Hive.
> In particular, this would allow contributors working through Github to 
> validate their own commits on their own branches.
> Please comment.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13550) Get rid of wrapped LlapInputSplit/InputFormat classes

2016-04-19 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-13550.
---
Resolution: Fixed

Committed to LLAP branch

> Get rid of wrapped LlapInputSplit/InputFormat classes
> -
>
> Key: HIVE-13550
> URL: https://issues.apache.org/jira/browse/HIVE-13550
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13550.1.patch
>
>
> LlapInputSplit/InputFormat classes are wrapped and referenced via 
> classname/reflection. We can probably do without these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Attachment: HIVE-13552.1.patch

> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Attachment: (was: HIVE-13552.1.patch)

> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6476) Support Append with Dynamic Partitioning

2016-04-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248800#comment-15248800
 ] 

Sushanth Sowmyan commented on HIVE-6476:


Oh dear. I'm not certain I am able to list out all the corner cases that I 
considered back when implementing the append functionality. However, from the 
code, I do see one issue that I think is the primary blocker for this. If this 
can be resolved, I think we can move forward on this.

Consider the following dynamic-partition write.

Say our current write spills to partitions p=3, p=4 & p=5. Before the write 
happened, say the table had data for partitions p=1, p=2 & p=3. So, now, our 
new write does not affect p=1 & p=2, does an "append" for p=3, and creates new 
partitions for p=4 & p=5.

If we never had to consider p=3 in this case, then HCat would try to do the 
add_partitions call at the OutputCommitter side as an atomic call to the 
metastore to prevent complicated rollback logic with the MS, and if the 
add_partitions call succeeds, it will proceed to move data into the appropriate 
directories. Now that p=3 is also in the picture, add_partitions will fail 
since p=3 already exists.

Thus, we needed to support an additional add_partitions_if_not_exist behaviour, 
which didn't exist then (but does now), and we weren't sure if it was the right 
call to add that (but now, that is moot).

The next aspect to consider is this:

In the case of a non-dynamic-ptn append, we try to move the new append files in 
with _N? suffixes if files of the same name already exist, and do so till we 
hit a max configurable limit - if we should fail in this file-movement phase, 
or determine we can't copy any of the items needed, then append can roll back 
trivially because we have no metadata change to roll back. There is no metadata 
update that is needed, and this part can't fail because of some other partition 
or because some other metadata needed updating that couldn't be updated.

If we allow dynamic-ptn appends, then if we have to do a rollback for any 
reason, in any case where a metadata rollback or an fs rollback is needed, we 
need to make sure that the state is compatible with how it was before this 
operation started.

While these are gotchas, they are not insurmountable, and as long as we are 
able to plan out how we handle the mix, this should be doable.
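
For reference, the add_partitions behaviour discussed above maps onto the metastore client roughly as follows (a sketch; the boolean-flag overload is the "if not exists" variant the comment says now exists, and the partition list stands in for whatever the OutputCommitter has assembled):

{code}
import java.util.List;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class DynPartAppendSketch {
  static void commitPartitions(IMetaStoreClient msClient, List<Partition> newParts)
      throws Exception {
    // Historical HCat behaviour: a single atomic call that fails outright if
    // any partition (e.g. p=3 above) already exists, keeping rollback trivial.
    //   msClient.add_partitions(newParts);

    // The "if not exists" variant: partitions that already exist (p=3) are
    // skipped instead of failing the batch, so p=4 and p=5 can still be added
    // and the append can proceed.
    msClient.add_partitions(newParts, true /* ifNotExists */, false /* needResults */);
  }
}
{code}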

> Support Append with Dynamic Partitioning
> 
>
> Key: HIVE-6476
> URL: https://issues.apache.org/jira/browse/HIVE-6476
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore, Query Processor, Thrift API
>Reporter: Sushanth Sowmyan
>
> Currently, we do not support mixing dynamic partitioning and append in the 
> same job. One reason is that we need exhaustive testing of corner cases for 
> that, and a second reason is the behaviour of add_partitions. To support 
> dynamic partitioning with append, we'd have to have an 
> add_partitions_if_not_exist call, rather than an add_partitions call.
> Thus, the current implementation in HIVE-6475 assumes immutability for all 
> dynamic partitioning jobs, irrespective of whether or not the table is marked 
> as mutable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13130) HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248785#comment-15248785
 ] 

Ashutosh Chauhan commented on HIVE-13130:
-

+1 pending tests

>  HS2 changes : API calls for retrieving primary keys and foreign keys 
> information
> -
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, 
> HIVE-13130.3.patch, HIVE-13130.4.patch, HIVE-13130.5.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of the PK/FK implementation in Hive.
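
For orientation, these are the standard java.sql.DatabaseMetaData entry points being wired through; a quick client-side sketch (table names borrowed from the HIVE-13290 example elsewhere in this digest, purely for illustration):

{code}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class PkFkMetadataSketch {
  public static void main(String[] args) throws Exception {
    try (Connection con =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default")) {
      DatabaseMetaData md = con.getMetaData();
      // JDBC's getPrimaryKeys: primary key columns of a table.
      try (ResultSet pk = md.getPrimaryKeys(null, "default", "product")) {
        while (pk.next()) {
          System.out.println(
              pk.getString("COLUMN_NAME") + " / " + pk.getString("PK_NAME"));
        }
      }
      // JDBC's getCrossReference: FK columns of product that reference vendor.
      try (ResultSet fk = md.getCrossReference(
          null, "default", "vendor", null, "default", "product")) {
        while (fk.next()) {
          System.out.println(
              fk.getString("FKCOLUMN_NAME") + " -> " + fk.getString("PKCOLUMN_NAME"));
        }
      }
    }
  }
}
{code}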



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13343) Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join

2016-04-19 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248675#comment-15248675
 ] 

Vikram Dixit K commented on HIVE-13343:
---

[~hagleitn] could you review this please?

> Need to disable hybrid grace hash join in llap mode except for dynamically 
> partitioned hash join
> 
>
> Key: HIVE-13343
> URL: https://issues.apache.org/jira/browse/HIVE-13343
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13343.1.patch, HIVE-13343.2.patch
>
>
> For performance reasons, we should disable the use of hybrid grace hash join 
> in llap when dynamic partition hash join is not used. With dynamic partition 
> hash join, we need hybrid grace hash join due to the possibility of skew.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13342) Improve logging in llap decider and throw exception in case llap mode is all but we cannot run in llap.

2016-04-19 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248677#comment-15248677
 ] 

Vikram Dixit K commented on HIVE-13342:
---

[~hagleitn] could you review this please?

> Improve logging in llap decider and throw exception in case llap mode is all 
> but we cannot run in llap.
> ---
>
> Key: HIVE-13342
> URL: https://issues.apache.org/jira/browse/HIVE-13342
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13342.1.patch, HIVE-13342.2.patch, 
> HIVE-13342.3.patch, HIVE-13342.4.patch
>
>
> Currently we do not log our decisions with respect to llap: are we running 
> everything in llap mode, or only parts of the plan? We need more logging. 
> Also, if llap mode is all but for some reason we cannot run the work in llap 
> mode, fail and throw an exception advising the user to change the mode to auto.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Status: Patch Available  (was: In Progress)

> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Attachment: HIVE-13552.1.patch

> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13552 started by Dennis Chan.
--
> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
> Attachments: HIVE-13552.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13552) Templeton job does not write out log files on IOException

2016-04-19 Thread Dennis Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Chan updated HIVE-13552:
---
Summary: Templeton job does not write out log files on IOException  (was: 
Templeton job does not write out log files on exception)

> Templeton job does not write out log files on IOException
> -
>
> Key: HIVE-13552
> URL: https://issues.apache.org/jira/browse/HIVE-13552
> Project: Hive
>  Issue Type: Bug
>Reporter: Dennis Chan
>Assignee: Dennis Chan
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13551) Make cleardanglingscratchdir work on Windows

2016-04-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-13551:
--
Status: Patch Available  (was: Open)

> Make cleardanglingscratchdir work on Windows
> 
>
> Key: HIVE-13551
> URL: https://issues.apache.org/jira/browse/HIVE-13551
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-13551.1.patch
>
>
> Saw a couple of issues when running cleardanglingscratchdir on Windows, 
> including:
> 1. dfs.support.append is set to false in Azure clusters; need an alternative 
> way when append is disabled
> 2. fix for cmd scripts
> 3. fix UT on Windows



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13551) Make cleardanglingscratchdir work on Windows

2016-04-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-13551:
--
Attachment: HIVE-13551.1.patch

> Make cleardanglingscratchdir work on Windows
> 
>
> Key: HIVE-13551
> URL: https://issues.apache.org/jira/browse/HIVE-13551
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-13551.1.patch
>
>
> Saw a couple of issues when running cleardanglingscratchdir on Windows, 
> including:
> 1. dfs.support.append is set to false in Azure clusters; need an alternative 
> way when append is disabled
> 2. fix for cmd scripts
> 3. fix UT on Windows



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13130) HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13130:
-
Attachment: HIVE-13130.5.patch

>  HS2 changes : API calls for retrieving primary keys and foreign keys 
> information
> -
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, 
> HIVE-13130.3.patch, HIVE-13130.4.patch, HIVE-13130.5.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes 
> getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of the PK/FK implementation in Hive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13509) HCatalog getSplits should ignore the partition with invalid path

2016-04-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248591#comment-15248591
 ] 

Chaoyu Tang commented on HIVE-13509:


[~szehon] [~mithun], could you review the patch? It has a new property 
hcat.input.ignore.invalid.path for backwards compatibility. Thanks.

> HCatalog getSplits should ignore the partition with invalid path
> 
>
> Key: HIVE-13509
> URL: https://issues.apache.org/jira/browse/HIVE-13509
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13509.1.patch, HIVE-13509.patch
>
>
> It is quite common that there is a discrepancy between a partition directory 
> and its HMS metadata, simply because the directory could be added/deleted 
> externally using hdfs shell commands. Technically it should be fixed by MSCK 
> and alter table .. add/drop commands etc, but sometimes that might not be 
> practical, especially in a multi-tenant env. This discrepancy does not cause 
> any problem for Hive, which returns no rows for a partition with an invalid 
> (e.g. non-existing) path, but it fails Pig loads with HCatLoader, because 
> the HCatBaseInputFormat getSplits throws an error when getting a split for a 
> non-existing path. The error message might look like:
> {code}
> Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does 
> not exist: 
> hdfs://xyz.com:8020/user/hive/warehouse/xyz/date=2016-01-01/country=BR
>   at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
>   at 
> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
>   at 
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
>   at 
> org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:162)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
> {code}
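
On the consumer side, the opt-in looks roughly like the following (a sketch; hcat.input.ignore.invalid.path is the property named in the comment above, and the database/table names are placeholders):

{code}
import org.apache.hadoop.mapreduce.Job;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;

public class IgnoreInvalidPathSketch {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    // Opt in to skipping partitions whose directory is gone; left unset, the
    // behaviour stays backwards compatible and getSplits still fails.
    job.getConfiguration().setBoolean("hcat.input.ignore.invalid.path", true);
    HCatInputFormat.setInput(job, "default", "xyz"); // placeholder db/table
  }
}
{code}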



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13550) Get rid of wrapped LlapInputSplit/InputFormat classes

2016-04-19 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-13550:
--
Attachment: HIVE-13550.1.patch

Remove wrapper LlapInputSplit, and remove the wrapped LlapInputFormat.
Also remove the qfile test, since TestJdbcWithLlap provides an end-to-end test.

> Get rid of wrapped LlapInputSplit/InputFormat classes
> -
>
> Key: HIVE-13550
> URL: https://issues.apache.org/jira/browse/HIVE-13550
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13550.1.patch
>
>
> LlapInputSplit/InputFormat classes are wrapped and referenced via 
> classname/reflection. We can probably do without these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12049:

Component/s: JDBC

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12049:

Affects Version/s: 2.0.0

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13342) Improve logging in llap decider and throw exception in case llap mode is all but we cannot run in llap.

2016-04-19 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248518#comment-15248518
 ] 

Vikram Dixit K commented on HIVE-13342:
---

Ping [~sershe]. Can you take a look as well, please?

> Improve logging in llap decider and throw exception in case llap mode is all 
> but we cannot run in llap.
> ---
>
> Key: HIVE-13342
> URL: https://issues.apache.org/jira/browse/HIVE-13342
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13342.1.patch, HIVE-13342.2.patch, 
> HIVE-13342.3.patch, HIVE-13342.4.patch
>
>
> Currently we do not log our decisions with respect to llap: are we running 
> everything in llap mode, or only parts of the plan? We need more logging. 
> Also, if llap mode is all but for some reason we cannot run the work in llap 
> mode, fail and throw an exception advising the user to change the mode to auto.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248516#comment-15248516
 ] 

Vaibhav Gumashta commented on HIVE-12049:
-

+1 pending tests. 

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.25.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Patch Available  (was: Open)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Open  (was: Patch Available)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch, 
> new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10293) enabling travis-ci build?

2016-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248491#comment-15248491
 ] 

Sergio Peña commented on HIVE-10293:


It is Apache Maven 3.0.5

> enabling travis-ci build?
> -
>
> Key: HIVE-10293
> URL: https://issues.apache.org/jira/browse/HIVE-10293
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Gabor Liptak
>Assignee: Gabor Liptak
>Priority: Minor
> Attachments: HIVE-10293.1.patch, HIVE-10293.2.diff
>
>
> I would like to contribute a .travis.yml for Hive.
> In particular, this would allow contributors working through Github, to 
> validate their own commits on their own branches.
> Please comment.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Patch committed to master. Thanks [~ashutoshc] for the review.

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: 13449.2.patch, HIVE-13349.1.patch, HIVE-13349.3.patch, 
> HIVE-13349.4.patch, HIVE-13349.5.patch, HIVE-13349.6.patch, HIVE-13349.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13548) hive-jdbc isn't escaping slashes during PreparedStatement

2016-04-19 Thread Nasron Cheong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nasron Cheong reassigned HIVE-13548:


Assignee: Vaibhav Gumashta  (was: Nasron Cheong)

> hive-jdbc isn't escaping slashes during PreparedStatement
> -
>
> Key: HIVE-13548
> URL: https://issues.apache.org/jira/browse/HIVE-13548
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Nasron Cheong
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-13548.patch
>
>
> Calling setString on a prepared statement with a string containing a '\' will 
> cause the SQL construction to fail.
> I believe the slash should be escaped by the setString function.
> There may be other characters that require escaping during the same call.
> Failure from the unittest without the patch:
> {code}
> Running org.apache.hive.jdbc.TestJdbcDriver2
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.738 sec <<< 
> FAILURE! - in org.apache.hive.jdbc.TestJdbcDriver2
> testSlashPreparedStatement(org.apache.hive.jdbc.TestJdbcDriver2)  Time 
> elapsed: 3.867 sec  <<< FAILURE!
> java.lang.AssertionError: java.lang.StringIndexOutOfBoundsException: String 
> index out of range: -1
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hive.jdbc.TestJdbcDriver2.testSlashPreparedStatement(TestJdbcDriver2.java:522)
> Results :
> Failed tests: 
>   TestJdbcDriver2.testSlashPreparedStatement:522 
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
> {code}
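
For context, a minimal sketch of the kind of escaping the report argues 
setString should perform (hedged: the eventual patch's exact escaping rules may 
differ; this only illustrates the backslash case plus the quote case):

{code}
public class SetStringEscapeSketch {
  // Escape backslashes before quotes, so that a literal '\' survives the
  // parameter substitution done by the JDBC driver.
  static String escapeParameter(String value) {
    return value.replace("\\", "\\\\").replace("'", "\\'");
  }

  public static void main(String[] args) {
    System.out.println(escapeParameter("a\\b'c")); // prints a\\b\'c
  }
}
{code}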



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13547) Switch Hive2 tests to JDK8

2016-04-19 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248309#comment-15248309
 ] 

Mohit Sabharwal commented on HIVE-13547:


makes sense, created HIVE-13549

> Switch Hive2 tests to JDK8
> --
>
> Key: HIVE-13547
> URL: https://issues.apache.org/jira/browse/HIVE-13547
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> Tracks switching Hive2 tests to JDK8 as discussed on dev list:
> http://mail-archives.apache.org/mod_mbox/hive-dev/201604.mbox/%3ccabjbx5mckfczpxpkr9kvkbaenhwvymbu_mwyfmgfs9snzni...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13480) Add hadoop2 metrics reporter for Codahale metrics

2016-04-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13480:

Attachment: HIVE-13480.3.patch

One more minor change to the HiveConf description.

> Add hadoop2 metrics reporter for Codahale metrics
> -
>
> Key: HIVE-13480
> URL: https://issues.apache.org/jira/browse/HIVE-13480
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13480.2.patch, HIVE-13480.3.patch, HIVE-13480.patch
>
>
> Multiple other Apache components can already send metrics to the Hadoop2 
> metrics system, which allows monitoring solutions like Ambari Metrics Server 
> to work against it and show metrics for many components in one place. Our 
> Codahale metrics setup works very well, so ideally we would like to bridge 
> the two: let Codahale add a Hadoop2 reporter that enables us to keep using 
> Codahale metrics (i.e. not write another custom metrics impl) while 
> reporting through Hadoop2.
> Apache Phoenix recently had a similar use case and was in the process of 
> adding a stub piece that allows this forwarding. We should use the same 
> reporter to minimize redundancy while pushing metrics to a centralized 
> solution like Hadoop2 Metrics/AMS.
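
A hedged sketch of what such a bridge might look like (illustrative only, not 
the HIVE-13480 patch; the source, record, and metric names are made up): poll a 
Codahale registry and republish its values through a Hadoop2 MetricsSource.

{code}
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;
import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.Interns;

public class CodahaleToHadoop2Sketch implements MetricsSource {
  private final MetricRegistry codahale = new MetricRegistry();
  private final Counter openOps = codahale.counter("open_operations");

  @Override
  public void getMetrics(MetricsCollector collector, boolean all) {
    // Re-expose the Codahale counter as a Hadoop2 gauge; "hive" mirrors the
    // component-name idea behind hive.service.metrics.hadoop2.component.
    collector.addRecord("hive")
        .addGauge(Interns.info("open_operations", "Open operations"),
            openOps.getCount());
  }

  public static void main(String[] args) {
    DefaultMetricsSystem.initialize("hive");
    DefaultMetricsSystem.instance()
        .register("CodahaleBridge", "Codahale-to-Hadoop2 bridge sketch",
            new CodahaleToHadoop2Sketch());
  }
}
{code}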



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13547) Switch Hive2 tests to JDK8

2016-04-19 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248283#comment-15248283
 ] 

Prasanth Jayachandran commented on HIVE-13547:
--

It would also be better to remove the duplicate out files (one for JDK7 and 
another for JDK8) as part of this jira/follow-up/sub-task.

Example: 
https://github.com/apache/hive/blob/master/ql/src/test/results/clientpositive/join0.q.java1.7.out
https://github.com/apache/hive/blob/master/ql/src/test/results/clientpositive/join0.q.java1.8.out

These 2 files exist because of HashMap ordering differences between these two 
Java versions.

> Switch Hive2 tests to JDK8
> --
>
> Key: HIVE-13547
> URL: https://issues.apache.org/jira/browse/HIVE-13547
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> Tracks switching Hive2 tests to JDK8 as discussed on dev list:
> http://mail-archives.apache.org/mod_mbox/hive-dev/201604.mbox/%3ccabjbx5mckfczpxpkr9kvkbaenhwvymbu_mwyfmgfs9snzni...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13533) Remove AST dump

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248276#comment-15248276
 ] 

Ashutosh Chauhan commented on HIVE-13533:
-

+1 pending tests
This will likely require golden-file updates to many files.

> Remove AST dump
> ---
>
> Key: HIVE-13533
> URL: https://issues.apache.org/jira/browse/HIVE-13533
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13533.patch
>
>
> For very large queries, dumping the AST can lead to OOM errors. Currently 
> there are two places where we dump the AST:
> - CalcitePlanner if we are running in DEBUG mode (line 300).
> - ExplainTask if we use extended explain (line 179).
> I guess the original reason to add the dump was to check whether the AST 
> conversion from CBO was working properly, but I think we are past that stage 
> now.
> We will remove the logic to dump the AST in explain extended. For debug mode 
> in CalcitePlanner, we will lower the level to LOG.TRACE.
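
For the CalcitePlanner side, the change presumably amounts to a guard like the 
following (a sketch; the actual call site and message may differ), so the 
potentially huge dump string is only materialized at TRACE:

{code}
import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AstDumpGuardSketch {
  private static final Logger LOG = LoggerFactory.getLogger(AstDumpGuardSketch.class);

  static void logAst(ASTNode ast) {
    // Only build the dump string (which can be enormous for large queries,
    // hence the OOM risk) when TRACE logging is actually enabled.
    if (LOG.isTraceEnabled()) {
      LOG.trace("Initial CBO AST:\n{}", ast.dump());
    }
  }
}
{code}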



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13548) hive-jdbc isn't escaping slashes during PreparedStatement

2016-04-19 Thread Nasron Cheong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nasron Cheong updated HIVE-13548:
-
Description: 
Calling setString on a prepared statement with a string containing a '\' will 
cause the SQL construction to fail.

I believe the slash should be escaped by the setString function.

There may be other characters that require escaping during the same call.

Failure from the unittest without the patch:

{code}
Running org.apache.hive.jdbc.TestJdbcDriver2
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.738 sec <<< 
FAILURE! - in org.apache.hive.jdbc.TestJdbcDriver2
testSlashPreparedStatement(org.apache.hive.jdbc.TestJdbcDriver2)  Time elapsed: 
3.867 sec  <<< FAILURE!
java.lang.AssertionError: java.lang.StringIndexOutOfBoundsException: String 
index out of range: -1
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hive.jdbc.TestJdbcDriver2.testSlashPreparedStatement(TestJdbcDriver2.java:522)


Results :

Failed tests: 
  TestJdbcDriver2.testSlashPreparedStatement:522 
java.lang.StringIndexOutOfBoundsException: String index out of range: -1

Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
{code}

  was:
Calling setString on a prepared statement with a string containing a '\' will 
cause the SQL construction to fail.

I believe the slash should be escaped by the setString function.

There may be other characters that require escaping during the same call.


> hive-jdbc isn't escaping slashes during PreparedStatement
> -
>
> Key: HIVE-13548
> URL: https://issues.apache.org/jira/browse/HIVE-13548
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Nasron Cheong
>Assignee: Nasron Cheong
> Attachments: HIVE-13548.patch
>
>
> Calling setString on a prepared statement with a string containing a '\' will 
> cause the SQL construction to fail.
> I believe the slash should be escaped by the setString function.
> There may be other characters that require escaping during the same call.
> Failure from the unittest without the patch:
> {code}
> Running org.apache.hive.jdbc.TestJdbcDriver2
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.738 sec <<< 
> FAILURE! - in org.apache.hive.jdbc.TestJdbcDriver2
> testSlashPreparedStatement(org.apache.hive.jdbc.TestJdbcDriver2)  Time 
> elapsed: 3.867 sec  <<< FAILURE!
> java.lang.AssertionError: java.lang.StringIndexOutOfBoundsException: String 
> index out of range: -1
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hive.jdbc.TestJdbcDriver2.testSlashPreparedStatement(TestJdbcDriver2.java:522)
> Results :
> Failed tests: 
>   TestJdbcDriver2.testSlashPreparedStatement:522 
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13493) Fix TransactionBatchImpl.getCurrentTxnId() and mis logging fixes

2016-04-19 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248267#comment-15248267
 ] 

Wei Zheng commented on HIVE-13493:
--

Looks good. +1

> Fix TransactionBatchImpl.getCurrentTxnId() and mis logging fixes
> 
>
> Key: HIVE-13493
> URL: https://issues.apache.org/jira/browse/HIVE-13493
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13493.patch
>
>
> * Sort the list of transaction IDs deleted by performTimeouts.
> * Sort the list of "empty aborted" transactions.
> * Log the list of lock IDs removed due to timeout.
> * Fix TransactionBatchImpl.getCurrentTxnId() not to look past the end of the 
> array (see HIVE-13489): in beginNextTransactionImpl(), the condition 
> {{if ( currentTxnIndex >= txnIds.size() )}} is bogus and should check 
> currentTxnIndex + 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13548) hive-jdbc isn't escaping slashes during PreparedStatement

2016-04-19 Thread Nasron Cheong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nasron Cheong updated HIVE-13548:
-
Status: Patch Available  (was: Open)

Escaping '\' during prepared statement setString

> hive-jdbc isn't escaping slashes during PreparedStatement
> -
>
> Key: HIVE-13548
> URL: https://issues.apache.org/jira/browse/HIVE-13548
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Nasron Cheong
>Assignee: Nasron Cheong
> Attachments: HIVE-13548.patch
>
>
> Calling setString on a prepared statement with a string containing a '\' will 
> cause the SQL construction to fail.
> I believe the slash should be escaped by the setString function.
> There may be other characters that require escaping during the same call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13548) hive-jdbc isn't escaping slashes during PreparedStatement

2016-04-19 Thread Nasron Cheong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nasron Cheong updated HIVE-13548:
-
Attachment: HIVE-13548.patch

> hive-jdbc isn't escaping slashes during PreparedStatement
> -
>
> Key: HIVE-13548
> URL: https://issues.apache.org/jira/browse/HIVE-13548
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Nasron Cheong
>Assignee: Nasron Cheong
> Attachments: HIVE-13548.patch
>
>
> Calling setString on a prepared statement with a string containing a '\' will 
> cause the SQL construction to fail.
> I believe the slash should be escaped by the setString function.
> There may be other characters that require escaping during the same call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13409) Fix JDK8 test failures related to COLUMN_STATS_ACCURATE

2016-04-19 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13409:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-13547

> Fix JDK8 test failures related to COLUMN_STATS_ACCURATE
> ---
>
> Key: HIVE-13409
> URL: https://issues.apache.org/jira/browse/HIVE-13409
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> 126 failures have crept into JDK8 tests since we resolved HIVE-8607
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-JAVA8/
> The majority relate to the ordering of the "COLUMN_STATS_ACCURATE" partition 
> property.
> It looks like a simple fix: use an ordered map in 
> HiveStringUtils.getPropertiesExplain().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13480) Add hadoop2 metrics reporter for Codahale metrics

2016-04-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13480:

Attachment: HIVE-13480.2.patch

Uploading updated patch with version change per [~szehon]. (Thought I'd 
uploaded this last week, but don't see my previous upload.)

> Add hadoop2 metrics reporter for Codahale metrics
> -
>
> Key: HIVE-13480
> URL: https://issues.apache.org/jira/browse/HIVE-13480
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13480.2.patch, HIVE-13480.patch
>
>
> Multiple other Apache components can already send metrics to the Hadoop2 
> metrics system, which allows monitoring solutions like Ambari Metrics Server 
> to work against it and show metrics for many components in one place. Our 
> Codahale metrics setup works very well, so ideally we would like to bridge 
> the two: let Codahale add a Hadoop2 reporter that enables us to keep using 
> Codahale metrics (i.e. not write another custom metrics impl) while 
> reporting through Hadoop2.
> Apache Phoenix recently had a similar use case and was in the process of 
> adding a stub piece that allows this forwarding. We should use the same 
> reporter to minimize redundancy while pushing metrics to a centralized 
> solution like Hadoop2 Metrics/AMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248179#comment-15248179
 ] 

Ashutosh Chauhan commented on HIVE-13349:
-

[~hsubramaniyan] If you can't repro the reported failures locally, then please 
go ahead and commit this. The queue on the build machine is too long.

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: 13449.2.patch, HIVE-13349.1.patch, HIVE-13349.3.patch, 
> HIVE-13349.4.patch, HIVE-13349.5.patch, HIVE-13349.6.patch, HIVE-13349.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10293) enabling travis-ci build?

2016-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248168#comment-15248168
 ] 

Ashutosh Chauhan commented on HIVE-10293:
-

[~spena] , Do you know which mvn version we are using on pTest2 server?

> enabling travis-ci build?
> -
>
> Key: HIVE-10293
> URL: https://issues.apache.org/jira/browse/HIVE-10293
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Gabor Liptak
>Assignee: Gabor Liptak
>Priority: Minor
> Attachments: HIVE-10293.1.patch, HIVE-10293.2.diff
>
>
> I would like to contribute a .travis.yml for Hive.
> In particular, this would allow contributors working through Github, to 
> validate their own commits on their own branches.
> Please comment.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13357) TxnHandler.checkQFileTestHack() should not call TxnDbUtil.setConfValues()

2016-04-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13357:
--
Target Version/s: 1.3.0, 2.1.0
   Fix Version/s: (was: 2.1.0)
  (was: 1.3.0)

> TxnHandler.checkQFileTestHack() should not call TxnDbUtil.setConfValues()
> -
>
> Key: HIVE-13357
> URL: https://issues.apache.org/jira/browse/HIVE-13357
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> This should be a client-side (test-side) responsibility; as is, it can 
> sometimes clobber settings made by the test client.
> Longer term, we should try not calling TxnDbUtil.prepDb() from TxnHandler 
> either.
> We can probably create a UDF to run this so that Q-file tests can init the 
> tables.
> See if this is even necessary - all TXN tables are part of the main Derby 
> .sql init file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13542) Missing stats for tables in TPCDS performance regression suite

2016-04-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248021#comment-15248021
 ] 

Jesus Camacho Rodriguez commented on HIVE-13542:


[~hsubramaniyan], still the same problem. It can be reproduced by running e.g. query7.q.

> Missing stats for tables in TPCDS performance regression suite
> --
>
> Key: HIVE-13542
> URL: https://issues.apache.org/jira/browse/HIVE-13542
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13542.1.patch
>
>
> These are the tables whose stats are missing in 
> data/files/tpcds-perf/metastore_export/csv/TAB_COL_STATS.txt:
> * catalog_returns
> * catalog_sales
> * inventory
> * store_returns
> * store_sales
> * web_returns
> * web_sales
> Thanks to [~jcamachorodriguez] for discovering this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13542) Missing stats for tables in TPCDS performance regression suite

2016-04-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247977#comment-15247977
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-13542:
--

[~jcamachorodriguez] Thanks.
It seems there is a corrupt entry in TAB_COL_STATS.txt; can you please replace 
line 101 (the one with cd_demo_sk) with 

{code}
default,customer_demographics,cd_demo_sk,int,1,1920800,1835839,0,1434571729,6296,_customer_demographics_
{code}

and see if it resolves the issue.

Thanks
Hari


> Missing stats for tables in TPCDS performance regression suite
> --
>
> Key: HIVE-13542
> URL: https://issues.apache.org/jira/browse/HIVE-13542
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13542.1.patch
>
>
> These are the tables whose stats are missing in 
> data/files/tpcds-perf/metastore_export/csv/TAB_COL_STATS.txt:
> * catalog_returns
> * catalog_sales
> * inventory
> * store_returns
> * store_sales
> * web_returns
> * web_sales
> Thanks to [~jcamachorodriguez] for discovering this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13542) Missing stats for tables in TPCDS performance regression suite

2016-04-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247892#comment-15247892
 ] 

Jesus Camacho Rodriguez commented on HIVE-13542:


Thanks [~hsubramaniyan].

I have been running some tests using your patch, e.g. query7.q. Examining the 
logs, I see the following message, which seems to indicate we still have some 
kind of problem with the stats:

{noformat}
Caused by: org.apache.hadoop.hive.metastore.DeadlineException: The threadlocal 
Deadline is null, please register it first.
at 
org.apache.hadoop.hive.metastore.Deadline.checkTimeout(Deadline.java:149) 
~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$7.getJdoResult(ObjectStore.java:6686)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$7.getJdoResult(ObjectStore.java:6664)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2550)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6663)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6657)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_statistics_req(HiveMetaStore.java:4327)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTableColumnStatistics(HiveMetaStoreClient.java:1570)
 ~[hive-metastore-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTableColumnStatistics(SessionHiveMetaStoreClient.java:347)
 ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.Hive.getTableColumnStatistics(Hive.java:3317)
 ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
... 178 more
2016-04-19T07:23:58,718 ERROR [ee2214e0-ff62-4617-a51c-45dfba3ae1c0 main[]]: 
calcite.RelOptHiveTable (RelOptHiveTable.java:updateColStats(395)) - No Stats 
for default@customer_demographics, Columns: cd_demo_sk
2016-04-19T07:23:58,719 WARN  [ee2214e0-ff62-4617-a51c-45dfba3ae1c0 main[]]: 
parse.CalcitePlanner (CalcitePlanner.java:apply(993)) - Missing column stats 
(see previous messages), skipping join reordering in CBO
{noformat}

In particular, that column is present in 
{{data/files/tpcds-perf/metastore_export/csv/TAB_COL_STATS.txt}}.

I could avoid the DeadlineException by setting {{hive.metastore.fastpath}} to 
false in {{data/conf/perf-reg/hive-site.xml}}, which avoids bypassing the raw 
store proxy, since we are using the ObjectStore in the PerfCliDriver. However, 
even after doing that, I still see the missing-stats message.

> Missing stats for tables in TPCDS performance regression suite
> --
>
> Key: HIVE-13542
> URL: https://issues.apache.org/jira/browse/HIVE-13542
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13542.1.patch
>
>
> These are the tables whose stats are missing in 
> data/files/tpcds-perf/metastore_export/csv/TAB_COL_STATS.txt:
> * catalog_returns
> * catalog_sales
> * inventory
> * store_returns
> * store_sales
> * web_returns
> * web_sales
> Thanks to [~jcamachorodriguez] for discovering this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9862) Vectorized execution corrupts timestamp values

2016-04-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247486#comment-15247486
 ] 

Matt McCline commented on HIVE-9862:


Also committed to branch-1 with HIVE-13111.

> Vectorized execution corrupts timestamp values
> --
>
> Key: HIVE-9862
> URL: https://issues.apache.org/jira/browse/HIVE-9862
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 1.0.0
>Reporter: Nathan Howell
>Assignee: Matt McCline
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-9862.01.patch, HIVE-9862.02.patch, 
> HIVE-9862.03.patch, HIVE-9862.04.patch, HIVE-9862.05.patch, 
> HIVE-9862.06.patch, HIVE-9862.07.patch, HIVE-9862.08.patch, HIVE-9862.09.patch
>
>
> Timestamps in the future (year 2250?) and before ~1700 are silently corrupted 
> in vectorized execution mode. Simple repro:
> {code}
> hive> DROP TABLE IF EXISTS test;
> hive> CREATE TABLE test(ts TIMESTAMP) STORED AS ORC;
> hive> INSERT INTO TABLE test VALUES ('9999-12-31 23:59:59');
> hive> SET hive.vectorized.execution.enabled = false;
> hive> SELECT MAX(ts) FROM test;
> 9999-12-31 23:59:59
> hive> SET hive.vectorized.execution.enabled = true;
> hive> SELECT MAX(ts) FROM test;
> 1816-03-30 05:56:07.066277376
> {code}
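
The reported garbage is consistent with a signed 64-bit nanoseconds-since-epoch 
representation overflowing (a hedged sketch of the arithmetic, assuming that 
internal representation, which is what the pre-fix vectorization code used); 
the wrap-around lands in exactly the early-1800s range shown above:

{code}
import java.sql.Timestamp;

public class NanosOverflowSketch {
  public static void main(String[] args) {
    Timestamp ts = Timestamp.valueOf("9999-12-31 23:59:59");
    long seconds = ts.getTime() / 1000L;
    // ~2.5e20 nanoseconds does not fit in a signed long (max ~9.2e18),
    // so this multiplication silently wraps around.
    long nanos = seconds * 1_000_000_000L + ts.getNanos();
    // Converting the wrapped value back yields an 1816 date, matching the
    // corrupted result in the repro.
    System.out.println(new Timestamp(nanos / 1_000_000L));
  }
}
{code}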



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9862) Vectorized execution corrupts timestamp values

2016-04-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9862:
---
Fix Version/s: 1.3.0

> Vectorized execution corrupts timestamp values
> --
>
> Key: HIVE-9862
> URL: https://issues.apache.org/jira/browse/HIVE-9862
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 1.0.0
>Reporter: Nathan Howell
>Assignee: Matt McCline
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-9862.01.patch, HIVE-9862.02.patch, 
> HIVE-9862.03.patch, HIVE-9862.04.patch, HIVE-9862.05.patch, 
> HIVE-9862.06.patch, HIVE-9862.07.patch, HIVE-9862.08.patch, HIVE-9862.09.patch
>
>
> Timestamps in the future (year 2250?) and before ~1700 are silently corrupted 
> in vectorized execution mode. Simple repro:
> {code}
> hive> DROP TABLE IF EXISTS test;
> hive> CREATE TABLE test(ts TIMESTAMP) STORED AS ORC;
> hive> INSERT INTO TABLE test VALUES ('9999-12-31 23:59:59');
> hive> SET hive.vectorized.execution.enabled = false;
> hive> SELECT MAX(ts) FROM test;
> 9999-12-31 23:59:59
> hive> SET hive.vectorized.execution.enabled = true;
> hive> SELECT MAX(ts) FROM test;
> 1816-03-30 05:56:07.066277376
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862

2016-04-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247484#comment-15247484
 ] 

Matt McCline commented on HIVE-13111:
-

Also, committed to branch-1 with HIVE-9862.

> Fix timestamp / interval_day_time wrong results with HIVE-9862 
> ---
>
> Key: HIVE-13111
> URL: https://issues.apache.org/jira/browse/HIVE-13111
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch, 
> HIVE-13111.03.patch, HIVE-13111.04.patch, HIVE-13111.05.patch, 
> HIVE-13111.06.patch, HIVE-13111.07.patch
>
>
> Fix timestamp / interval_day_time issues discovered when testing the 
> Vectorized Text patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862

2016-04-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13111:

Fix Version/s: 1.3.0

> Fix timestamp / interval_day_time wrong results with HIVE-9862 
> ---
>
> Key: HIVE-13111
> URL: https://issues.apache.org/jira/browse/HIVE-13111
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch, 
> HIVE-13111.03.patch, HIVE-13111.04.patch, HIVE-13111.05.patch, 
> HIVE-13111.06.patch, HIVE-13111.07.patch
>
>
> Fix timestamp / interval_day_time issues discovered when testing the 
> Vectorized Text patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247367#comment-15247367
 ] 

ASF GitHub Bot commented on HIVE-13545:
---

GitHub user sundapeng opened a pull request:

https://github.com/apache/hive/pull/73

HIVE-13545: Add GLOBAL Type to Entity



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sundapeng/hive HIVE-13545

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/73.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #73


commit 8246a01036460b55eca85abd47eb0b158b418fab
Author: Sun Dapeng 
Date:   2016-04-19T07:30:50Z

HIVE-13545: Add GLOBAL Type to Entity




> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-13545.001.patch
>
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.
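
A hedged sketch of the shape of the requested change (the enum members below 
are illustrative stand-ins, not Hive's exact definitions): give Entity a GLOBAL 
type and map it across to the authorization side.

{code}
public class EntityGlobalTypeSketch {
  // Illustrative stand-ins for Entity.Type and HivePrivilegeObjectType.
  enum EntityType { DATABASE, TABLE, FUNCTION, GLOBAL }
  enum HivePrivilegeObjectType { DATABASE, TABLE_OR_VIEW, FUNCTION, GLOBAL }

  static HivePrivilegeObjectType toPrivilegeType(EntityType t) {
    switch (t) {
      case GLOBAL:   return HivePrivilegeObjectType.GLOBAL;
      case DATABASE: return HivePrivilegeObjectType.DATABASE;
      case TABLE:    return HivePrivilegeObjectType.TABLE_OR_VIEW;
      case FUNCTION: return HivePrivilegeObjectType.FUNCTION;
      default: throw new IllegalArgumentException("unmapped type: " + t);
    }
  }

  public static void main(String[] args) {
    System.out.println(toPrivilegeType(EntityType.GLOBAL)); // GLOBAL
  }
}
{code}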



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Attachment: HIVE-13545.001.patch

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-13545.001.patch
>
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Status: Patch Available  (was: Open)

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Description: {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} 
don't have the {{GLOBAL}} type, it should be matched with 
{{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
 At the same time, we should enable the custom converting from Entity to 
HivePrivilegeObject  (was: 
{{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} don't have the 
{{GLOBAL}} type, it should be matched with 
{{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}})

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Rajat Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247351#comment-15247351
 ] 

Rajat Khandelwal commented on HIVE-13421:
-

Taking the patch from ReviewBoard and attaching it here.

> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch, HIVE-13421.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-13421:

Status: Patch Available  (was: In Progress)

> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch, HIVE-13421.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-13421:

Attachment: HIVE-13421.04.patch

> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch, HIVE-13421.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13421 started by Rajat Khandelwal.
---
> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13421) Propagate job progress in operation status

2016-04-19 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-13421:

Status: Open  (was: Patch Available)

Cancelling the patch. Will resubmit so that the pre-commit job can run. 

> Propagate job progress in operation status
> --
>
> Key: HIVE-13421
> URL: https://issues.apache.org/jira/browse/HIVE-13421
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch, 
> HIVE-13421.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13546) Patch for HIVE-12893 is broken in branch-1

2016-04-19 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247342#comment-15247342
 ] 

Nemon Lou commented on HIVE-13546:
--

After reverting HIVE-12893 or applying the patch above, this issue gets fixed.

> Patch for HIVE-12893 is broken in branch-1 
> ---
>
> Key: HIVE-13546
> URL: https://issues.apache.org/jira/browse/HIVE-13546
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
> Attachments: HIVE-13546.patch
>
>
> The following sql fails:
> {noformat}
> set hive.map.aggr=true;
> set mapreduce.reduce.speculative=false;
> set hive.auto.convert.join=true;
> set hive.optimize.reducededuplication = false;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.optimize.mapjoin.mapreduce=true;
> set hive.stats.autogather=true;
> set mapred.reduce.parallel.copies=30;
> set mapred.job.shuffle.input.buffer.percent=0.5;
> set mapred.job.reduce.input.buffer.percent=0.2;
> set mapred.map.child.java.opts=-server -Xmx2800m 
> -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.child.java.opts=-server -Xmx3800m 
> -Djava.net.preferIPv4Stack=true;
> set mapreduce.map.memory.mb=3072;
> set mapreduce.reduce.memory.mb=4096;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=10;
> set hive.exec.max.dynamic.partitions=10;
> set hive.exec.max.created.files=100;
> set hive.exec.parallel=true;
> set hive.exec.reducers.max=2000;
> set hive.stats.autogather=true;
> set hive.optimize.sort.dynamic.partition=true;
> set mapred.job.reduce.input.buffer.percent=0.0;
> set mapreduce.input.fileinputformat.split.minsizee=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.node=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.rack=24000;
> set hive.optimize.sort.dynamic.partition=true;
> use tpcds_bin_partitioned_orc_4;
> insert overwrite table store_sales partition (ss_sold_date_sk)
> select
> ss.ss_sold_time_sk,
> ss.ss_item_sk,
> ss.ss_customer_sk,
> ss.ss_cdemo_sk,
> ss.ss_hdemo_sk,
> ss.ss_addr_sk,
> ss.ss_store_sk,
> ss.ss_promo_sk,
> ss.ss_ticket_number,
> ss.ss_quantity,
> ss.ss_wholesale_cost,
> ss.ss_list_price,
> ss.ss_sales_price,
> ss.ss_ext_discount_amt,
> ss.ss_ext_sales_price,
> ss.ss_ext_wholesale_cost,
> ss.ss_ext_list_price,
> ss.ss_ext_tax,
> ss.ss_coupon_amt,
> ss.ss_net_paid,
> ss.ss_net_paid_inc_tax,
> ss.ss_net_profit,
> ss.ss_sold_date_sk
>   from tpcds_text_4.store_sales ss;
> {noformat}
> Error log is as follows
> {noformat}
> 2016-04-19 15:15:35,252 FATAL [main] ExecReducer: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":null},"value":{"_col0":null,"_col1":5588,"_col2":170300,"_col3":null,"_col4":756,"_col5":91384,"_col6":16,"_col7":null,"_col8":855582,"_col9":28,"_col10":null,"_col11":48.83,"_col12":null,"_col13":0.0,"_col14":null,"_col15":899.64,"_col16":null,"_col17":6.14,"_col18":0.0,"_col19":null,"_col20":null,"_col21":null,"_col22":null}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:151)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:131)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:713)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
>   ... 7 more
> {noformat}
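
The bottom of the stack trace can be reproduced in isolation: 
FileUtils.makePartName indexes the partition-column list by the value index, so 
an empty column list with a non-empty value list throws exactly this exception. 
A hedged repro of the call pattern follows (the value shown is an example 
ss_sold_date_sk; this only demonstrates the failing call, not how branch-1 gets 
into that state):

{code}
import java.util.ArrayList;
import java.util.Arrays;
import org.apache.hadoop.hive.common.FileUtils;

public class MakePartNameRepro {
  public static void main(String[] args) {
    // Somewhere upstream the dynamic-partition column list ends up empty
    // while one partition value is still supplied.
    System.out.println(FileUtils.makePartName(
        new ArrayList<String>(), Arrays.asList("2451813")));
    // -> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
  }
}
{code}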



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-13546) Patch for HIVE-12893 is broken in branch-1

2016-04-19 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247335#comment-15247335
 ] 

Nemon Lou commented on HIVE-13546:
--

[~prasanth_j], would you take a look? It seems that the patch for HIVE-12893 
differs between master and branch-1.

> Patch for HIVE-12893 is broken in branch-1 
> ---
>
> Key: HIVE-13546
> URL: https://issues.apache.org/jira/browse/HIVE-13546
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
> Attachments: HIVE-13546.patch
>
>
> The following sql fails:
> {noformat}
> set hive.map.aggr=true;
> set mapreduce.reduce.speculative=false;
> set hive.auto.convert.join=true;
> set hive.optimize.reducededuplication = false;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.optimize.mapjoin.mapreduce=true;
> set hive.stats.autogather=true;
> set mapred.reduce.parallel.copies=30;
> set mapred.job.shuffle.input.buffer.percent=0.5;
> set mapred.job.reduce.input.buffer.percent=0.2;
> set mapred.map.child.java.opts=-server -Xmx2800m 
> -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.child.java.opts=-server -Xmx3800m 
> -Djava.net.preferIPv4Stack=true;
> set mapreduce.map.memory.mb=3072;
> set mapreduce.reduce.memory.mb=4096;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=10;
> set hive.exec.max.dynamic.partitions=10;
> set hive.exec.max.created.files=100;
> set hive.exec.parallel=true;
> set hive.exec.reducers.max=2000;
> set hive.stats.autogather=true;
> set hive.optimize.sort.dynamic.partition=true;
> set mapred.job.reduce.input.buffer.percent=0.0;
> set mapreduce.input.fileinputformat.split.minsizee=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.node=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.rack=24000;
> set hive.optimize.sort.dynamic.partition=true;
> use tpcds_bin_partitioned_orc_4;
> insert overwrite table store_sales partition (ss_sold_date_sk)
> select
> ss.ss_sold_time_sk,
> ss.ss_item_sk,
> ss.ss_customer_sk,
> ss.ss_cdemo_sk,
> ss.ss_hdemo_sk,
> ss.ss_addr_sk,
> ss.ss_store_sk,
> ss.ss_promo_sk,
> ss.ss_ticket_number,
> ss.ss_quantity,
> ss.ss_wholesale_cost,
> ss.ss_list_price,
> ss.ss_sales_price,
> ss.ss_ext_discount_amt,
> ss.ss_ext_sales_price,
> ss.ss_ext_wholesale_cost,
> ss.ss_ext_list_price,
> ss.ss_ext_tax,
> ss.ss_coupon_amt,
> ss.ss_net_paid,
> ss.ss_net_paid_inc_tax,
> ss.ss_net_profit,
> ss.ss_sold_date_sk
>   from tpcds_text_4.store_sales ss;
> {noformat}
> Error log is as follows
> {noformat}
> 2016-04-19 15:15:35,252 FATAL [main] ExecReducer: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":null},"value":{"_col0":null,"_col1":5588,"_col2":170300,"_col3":null,"_col4":756,"_col5":91384,"_col6":16,"_col7":null,"_col8":855582,"_col9":28,"_col10":null,"_col11":48.83,"_col12":null,"_col13":0.0,"_col14":null,"_col15":899.64,"_col16":null,"_col17":6.14,"_col18":0.0,"_col19":null,"_col20":null,"_col21":null,"_col22":null}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:151)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:131)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:713)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
>   ... 7 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-13546) Patch for HIVE-12893 is broken in branch-1

2016-04-19 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-13546:
-
Attachment: HIVE-13546.patch

> Patch for HIVE-12893 is broken in branch-1 
> ---
>
> Key: HIVE-13546
> URL: https://issues.apache.org/jira/browse/HIVE-13546
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
> Attachments: HIVE-13546.patch
>
>
> The following sql fails:
> {noformat}
> set hive.map.aggr=true;
> set mapreduce.reduce.speculative=false;
> set hive.auto.convert.join=true;
> set hive.optimize.reducededuplication = false;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.optimize.mapjoin.mapreduce=true;
> set hive.stats.autogather=true;
> set mapred.reduce.parallel.copies=30;
> set mapred.job.shuffle.input.buffer.percent=0.5;
> set mapred.job.reduce.input.buffer.percent=0.2;
> set mapred.map.child.java.opts=-server -Xmx2800m 
> -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.child.java.opts=-server -Xmx3800m 
> -Djava.net.preferIPv4Stack=true;
> set mapreduce.map.memory.mb=3072;
> set mapreduce.reduce.memory.mb=4096;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=10;
> set hive.exec.max.dynamic.partitions=10;
> set hive.exec.max.created.files=100;
> set hive.exec.parallel=true;
> set hive.exec.reducers.max=2000;
> set hive.stats.autogather=true;
> set hive.optimize.sort.dynamic.partition=true;
> set mapred.job.reduce.input.buffer.percent=0.0;
> set mapreduce.input.fileinputformat.split.minsizee=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.node=24000;
> set mapreduce.input.fileinputformat.split.minsize.per.rack=24000;
> set hive.optimize.sort.dynamic.partition=true;
> use tpcds_bin_partitioned_orc_4;
> insert overwrite table store_sales partition (ss_sold_date_sk)
> select
> ss.ss_sold_time_sk,
> ss.ss_item_sk,
> ss.ss_customer_sk,
> ss.ss_cdemo_sk,
> ss.ss_hdemo_sk,
> ss.ss_addr_sk,
> ss.ss_store_sk,
> ss.ss_promo_sk,
> ss.ss_ticket_number,
> ss.ss_quantity,
> ss.ss_wholesale_cost,
> ss.ss_list_price,
> ss.ss_sales_price,
> ss.ss_ext_discount_amt,
> ss.ss_ext_sales_price,
> ss.ss_ext_wholesale_cost,
> ss.ss_ext_list_price,
> ss.ss_ext_tax,
> ss.ss_coupon_amt,
> ss.ss_net_paid,
> ss.ss_net_paid_inc_tax,
> ss.ss_net_profit,
> ss.ss_sold_date_sk
>   from tpcds_text_4.store_sales ss;
> {noformat}
> Error log is as follows
> {noformat}
> 2016-04-19 15:15:35,252 FATAL [main] ExecReducer: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":null},"value":{"_col0":null,"_col1":5588,"_col2":170300,"_col3":null,"_col4":756,"_col5":91384,"_col6":16,"_col7":null,"_col8":855582,"_col9":28,"_col10":null,"_col11":48.83,"_col12":null,"_col13":0.0,"_col14":null,"_col15":899.64,"_col16":null,"_col17":6.14,"_col18":0.0,"_col19":null,"_col20":null,"_col21":null,"_col22":null}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:151)
>   at 
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:131)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:713)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
>   ... 7 more
> {noformat}
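The trace shows FileUtils.makePartName being handed an empty list of
dynamic-partition values: ArrayList.get is asked for index 0 of a size-0 list.
As a minimal sketch of that failure mode, assuming only that makePartName
indexes the values list by partition-column position (as the two makePartName
frames suggest), the following standalone Java fragment raises the same
exception. It is illustrative, not the actual reducer code path:
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hive.common.FileUtils;

public class MakePartNameRepro {
  public static void main(String[] args) {
    // One dynamic partition column, as in the INSERT above ...
    List<String> partCols = Collections.singletonList("ss_sold_date_sk");
    // ... but no value computed for it for this row.
    List<String> vals = new ArrayList<>();
    // Throws java.lang.IndexOutOfBoundsException: Index: 0, Size: 0,
    // matching the ArrayList.get frame in the error log.
    FileUtils.makePartName(partCols, vals);
  }
}
{code}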



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13540) Casts to numeric types don't seem to work in hplsql

2016-04-19 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247311#comment-15247311
 ] 

Dmitry Tolpeko commented on HIVE-13540:
---

I am working on this. 

> Casts to numeric types don't seem to work in hplsql
> ---
>
> Key: HIVE-13540
> URL: https://issues.apache.org/jira/browse/HIVE-13540
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>
> Maybe I'm doing this wrong, but it seems to be broken.
> Casts to string types work fine; casts to numeric types do not.
> This code:
> {code}
> temp_int = CAST('1' AS int);
> print temp_int
> temp_float   = CAST('1.2' AS float);
> print temp_float
> temp_double  = CAST('1.2' AS double);
> print temp_double
> temp_decimal = CAST('1.2' AS decimal(10, 4));
> print temp_decimal
> temp_string = CAST('1.2' AS string);
> print temp_string
> {code}
> Produces this output:
> {code}
> [vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql
> which: no hbase in 
> (/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin)
> WARNING: Use "yarn jar" to launch YARN applications.
> null
> null
> null
> null
> 1.2
> {code}
> The software I'm using is not a released version but is close to trunk, at
> most two weeks old.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13543) HiveServer2: Add unit tests for testing configurable LDAP user key name, group membership key name and group class name

2016-04-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13543:

Affects Version/s: (was: 2.1.0)
   1.2.1
   2.0.0

> HiveServer2: Add unit tests for testing configurable LDAP user key name, 
> group membership key name and group class name  
> -
>
> Key: HIVE-13543
> URL: https://issues.apache.org/jira/browse/HIVE-13543
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> HIVE-13295 made the above-mentioned properties configurable. We should add
> unit tests for them.
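As a starting point, a unit test could at least exercise the configuration
plumbing. The sketch below assumes the three settings surfaced by HIVE-13295
are hive.server2.authentication.ldap.guidKey,
hive.server2.authentication.ldap.groupMembershipKey and
hive.server2.authentication.ldap.groupClassKey; treat those property names,
and the sample LDAP attribute values, as unverified assumptions:
{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.hive.conf.HiveConf;
import org.junit.Test;

public class TestLdapConfigurableKeys {
  @Test
  public void ldapKeyNamesAreConfigurable() {
    HiveConf conf = new HiveConf();
    // Property names below are assumptions, not confirmed against HIVE-13295.
    conf.set("hive.server2.authentication.ldap.guidKey", "sAMAccountName");
    conf.set("hive.server2.authentication.ldap.groupMembershipKey", "memberUid");
    conf.set("hive.server2.authentication.ldap.groupClassKey", "posixGroup");

    // A full test would drive the LDAP authenticator against an embedded
    // directory server; this only verifies the values round-trip through conf.
    assertEquals("sAMAccountName",
        conf.get("hive.server2.authentication.ldap.guidKey"));
    assertEquals("memberUid",
        conf.get("hive.server2.authentication.ldap.groupMembershipKey"));
    assertEquals("posixGroup",
        conf.get("hive.server2.authentication.ldap.groupClassKey"));
  }
}
{code}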



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)