[jira] [Created] (HIVE-11021) ObjectStore should call closeAll() on JDO query object to release the resources

2015-06-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-11021:
---

 Summary: ObjectStore should call closeAll() on JDO query object to 
release the resources
 Key: HIVE-11021
 URL: https://issues.apache.org/jira/browse/HIVE-11021
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


In the ObjectStore class, in getMDatabase() and getMTable(), after retrieving 
the database and table info from the database, we should call closeAll() on the 
JDO query to release the resources. Otherwise it leaks cursors on the database.
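
For illustration, here is a minimal sketch of the close-in-finally pattern being 
proposed, assuming the standard JDO Query API and Hive's MDatabase model class 
(the actual ObjectStore code differs in detail):

{noformat}
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

import org.apache.hadoop.hive.metastore.model.MDatabase;

// Sketch only: shows the closeAll()-in-finally pattern, not the exact ObjectStore code.
public class CloseAllSketch {
  static MDatabase getMDatabase(PersistenceManager pm, String dbName) {
    Query query = pm.newQuery(MDatabase.class, "name == dbName");
    query.declareParameters("java.lang.String dbName");
    query.setUnique(true);
    try {
      return (MDatabase) query.execute(dbName);
    } finally {
      // closeAll() releases the query's result sets and the underlying database cursor.
      query.closeAll();
    }
  }
}
{noformat}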





[jira] [Created] (HIVE-11023) Disable directSQL if datanucleus.identifierFactory = datanucleus2

2015-06-16 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-11023:
---

 Summary: Disable directSQL if datanucleus.identifierFactory = 
datanucleus2
 Key: HIVE-11023
 URL: https://issues.apache.org/jira/browse/HIVE-11023
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.3.0, 1.2.1, 2.0.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


We hit an interesting bug in a case where datanucleus.identifierFactory = 
datanucleus2 .

The problem is that directSql hand-generates SQL strings assuming the 
datanucleus1 naming scheme. If a user's metastore JDO is managed with 
datanucleus.identifierFactory = datanucleus2, the SQL strings we generate are 
incorrect.

One simple example of what this results in is the following: whenever DN 
persists a field held as a List<T>, it stores each T as a separate row in the 
appropriate mapping table, with a column called INTEGER_IDX that holds the 
position in the list. Upon reading, it automatically reads all relevant rows 
with an ORDER BY INTEGER_IDX, so the list retains its order. In the DN2 naming 
scheme, the column is called IDX instead of INTEGER_IDX. If the user has run the 
appropriate metatool upgrade scripts, it is highly likely that they have both 
columns, INTEGER_IDX and IDX.

Whenever they use JDO, such as with all writes, it will then use the IDX field, 
and when they do any sort of optimized reads, such as through directSQL, it 
will ORDER BY INTEGER_IDX.

An immediate danger is seen when we consider that the schema of a table is 
stored as a List<FieldSchema>, and while IDX has 0,1,2,3,..., INTEGER_IDX will 
contain 0,0,0,0,..., so any attempt to describe the table or fetch its schema 
can come back in the table's native hashing order rather than sorted by the 
index.

This can then result in the schema ordering being different from the actual 
table. For example, if a user has a table (a:int, b:string, c:string), a 
describe on it may return (c:string, a:int, b:string), and thus queries which 
insert after selecting from another table can hit ClassCastExceptions when 
trying to insert data in the wrong order - this is how we discovered this bug. 
The problem, however, can be far worse if there are no type mismatches: if a, b 
and c were all strings, the insert query would succeed but mix up the order, 
which results in the user's table data being mixed up. This has the potential 
to be very bad.

We should write a tool to help convert metastores that use datanucleus2 to 
datanucleus1 (more difficult, needs more one-time testing), or change directSql 
to support both (easier to code, but it increases the test-coverage matrix 
significantly and we should then really be testing against both schemes). In 
the short term, we should disable directSql if we see that the identifier 
factory is datanucleus2.
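
As a rough sketch of the short-term guard (the property name comes from this 
report; the class and method names below are illustrative, not Hive's actual 
code):

{noformat}
import java.util.Properties;

// Hypothetical guard: direct SQL assumes datanucleus1-style column names
// (e.g. INTEGER_IDX), so skip it when the identifier factory is datanucleus2.
public class DirectSqlGuard {
  static boolean isDirectSqlSafe(Properties datanucleusProps) {
    String idFactory =
        datanucleusProps.getProperty("datanucleus.identifierFactory", "datanucleus1");
    return "datanucleus1".equalsIgnoreCase(idFactory);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("datanucleus.identifierFactory", "datanucleus2");
    System.out.println("use direct SQL? " + isDirectSqlSafe(props)); // prints false
  }
}
{noformat}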





Re: Review Request 34961: HIVE-10895: ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-16 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34961/#review88065
---



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 3513)
https://reviews.apache.org/r/34961/#comment140456

The closeAll() call closes the result set. So if we return the List or 
Collection object (which is backed by the query's iterator) directly from 
query.execute(), we can't iterate through it later anymore. We need to iterate 
through (copy) the list here, before closeAll() is called.
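
For example, a hedged sketch of materializing the result before closing (not 
the actual ObjectStore code; the helper below is hypothetical):

{noformat}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import javax.jdo.PersistenceManager;
import javax.jdo.Query;

// Copy the lazily-backed query result into a plain ArrayList before closeAll(),
// so callers can still iterate it after the result set is closed.
public class MaterializeBeforeClose {
  @SuppressWarnings("unchecked")
  static <T> List<T> runAndClose(PersistenceManager pm, Class<T> cls, String filter) {
    Query query = pm.newQuery(cls, filter);
    try {
      Collection<T> lazy = (Collection<T>) query.execute();
      return new ArrayList<T>(lazy); // forces full iteration while the query is still open
    } finally {
      query.closeAll();
    }
  }
}
{noformat}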



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 3543)
https://reviews.apache.org/r/34961/#comment140457

Same comments as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4431)
https://reviews.apache.org/r/34961/#comment140458

Same comments as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4460)
https://reviews.apache.org/r/34961/#comment140459

Same comments as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4483)
https://reviews.apache.org/r/34961/#comment140460

Same comments as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4528)
https://reviews.apache.org/r/34961/#comment140461

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4586)
https://reviews.apache.org/r/34961/#comment140462

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4621)
https://reviews.apache.org/r/34961/#comment140463

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4650)
https://reviews.apache.org/r/34961/#comment140464

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4679)
https://reviews.apache.org/r/34961/#comment140465

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4710)
https://reviews.apache.org/r/34961/#comment140466

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4771)
https://reviews.apache.org/r/34961/#comment140467

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4867)
https://reviews.apache.org/r/34961/#comment140468

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4903)
https://reviews.apache.org/r/34961/#comment140469

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4940)
https://reviews.apache.org/r/34961/#comment140470

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4981)
https://reviews.apache.org/r/34961/#comment140471

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5016)
https://reviews.apache.org/r/34961/#comment140472

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5050)
https://reviews.apache.org/r/34961/#comment140473

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5126)
https://reviews.apache.org/r/34961/#comment140474

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5158)
https://reviews.apache.org/r/34961/#comment140475

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5355)
https://reviews.apache.org/r/34961/#comment140478

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5386)
https://reviews.apache.org/r/34961/#comment140477

Same comment as above.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5436)
https://reviews.apache.org/r/34961/#comment140476

Same comment as above.


- Aihua Xu


On June 2, 2015, 11:07 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34961/
 ---
 
 (Updated June 2, 2015, 11:07 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-10895
 https://issues.apache.org/jira/browse/HIVE-10895
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-10895
 
 
 Diffs
 -
 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 fd61333 
 
 Diff: https://reviews.apache.org/r/34961/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Created] (HIVE-11022) Support collecting lists in user defined order

2015-06-16 Thread Michael Haeusler (JIRA)
Michael Haeusler created HIVE-11022:
---

 Summary: Support collecting lists in user defined order
 Key: HIVE-11022
 URL: https://issues.apache.org/jira/browse/HIVE-11022
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Michael Haeusler


Hive currently supports aggregation of lists in order of input rows with the 
UDF collect_list. Unfortunately, the order is not well defined when map-side 
aggregations are used.

Hive could support collecting lists in user-defined order by providing a UDF 
COLLECT_LIST_SORTED(valueColumn, sortColumn[, limit]) that returns a list of 
values sorted in a user-defined order. An optional limit parameter can restrict 
this to the first n values within that order.

Especially in the limit case, this can be efficiently pre-aggregated, reducing 
the amount of data transferred to the reducers.
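
For illustration only (this is not the proposed UDAF; the class below is 
hypothetical and uses a numeric sort key for simplicity), a bounded buffer like 
this shows why the limit case pre-aggregates cheaply on the map side:

{noformat}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Each map-side partial result only ever holds the first `limit` entries in sort
// order, so very little data needs to be shipped to the reducers.
public class TopNBuffer<V> {
  private static final class Entry<V> {
    final long sortKey;
    final V value;
    Entry(long sortKey, V value) { this.sortKey = sortKey; this.value = value; }
  }

  private final int limit;
  // Max-heap on the sort key: the largest retained key is evicted first.
  private final PriorityQueue<Entry<V>> heap;

  public TopNBuffer(int limit) {
    this.limit = limit;
    this.heap = new PriorityQueue<>(limit,
        Comparator.comparingLong((Entry<V> e) -> e.sortKey).reversed());
  }

  public void add(long sortKey, V value) {
    if (heap.size() < limit) {
      heap.offer(new Entry<V>(sortKey, value));
    } else if (sortKey < heap.peek().sortKey) {
      heap.poll();
      heap.offer(new Entry<V>(sortKey, value));
    }
  }

  public List<V> sortedValues() {
    List<Entry<V>> entries = new ArrayList<>(heap);
    entries.sort(Comparator.comparingLong((Entry<V> e) -> e.sortKey));
    List<V> out = new ArrayList<>(entries.size());
    for (Entry<V> e : entries) {
      out.add(e.value);
    }
    return out;
  }
}
{noformat}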





[jira] [Created] (HIVE-11024) Error inserting a date value via parameter marker (PreparedStatement.setDate)

2015-06-16 Thread Sergio Lob (JIRA)
Sergio Lob created HIVE-11024:
-

 Summary: Error inserting a date value via parameter marker 
(PreparedStatement.setDate)
 Key: HIVE-11024
 URL: https://issues.apache.org/jira/browse/HIVE-11024
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 0.14.0
 Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 
10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Sergio Lob


Inserting a row with a Date parameter marker (PreparedStatement.setDate()) 
fails with ParseException:

Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: ParseException line 1:41 mismatched input '-' expecting ) 
near '1980' in statement
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: ParseException line 1:41 mismatched input '-' expecting ) near '1980' 
in statement
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at org.apache.hive.jdbc.HiveStatement.executeUpdate(HiveStatement.java:406)
at org.apache.hive.jdbc.HivePreparedStatement.executeUpdate(HivePreparedStatement.java:117)
at repro1.main(repro1.java:90)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:41 mismatched input '-' expecting ) near '1980' in statement
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:102)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:171)



++
REPRO:
--

/*
 * It may be freely used, modified, and distributed with no restrictions.
 */
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.PreparedStatement;
import java.sql.ResultSetMetaData;
import java.io.Reader;

/**
 */
public class repro1
{
  /**
   * Main method.
   * 
   * @param args
   *no arguments required
   */
  public static void main(String [] args)
  {
Connection con = null;
Statement stmt = null;
ResultSet rst = null;

String drptab = "DROP TABLE SDLJUNK";
String crttab = "CREATE TABLE SDLJUNK(I INT, D DATE)";
String instab = "INSERT INTO TABLE SDLJUNK VALUES (1, ? )";

try {

  System.out.println("=====================================================");
  System.out.println("Problem description:");
  System.out.println("After setting a value for a DATE parameter marker ");
  System.out.println(" with PreparedStatement.setDate(),");
  System.out.println(" an INSERT statement fails execution with error:  ");
  System.out.println("  ");
  System.out.println("  Error while compiling statement: FAILED: ");
  System.out.println("ParseException line 1:78 mismatched input '-' ");
  System.out.println(" expecting ) near '1980' in statement");
  System.out.println("=====================================================");
  System.out.println();
  // Create new instance of JDBC Driver and make connection.
  System.out.println("Registering Driver.");
  Class.forName("org.apache.hive.jdbc.HiveDriver");

  String url = "jdbc:hive2://hwhive:1/R72D";
  System.out.println("Making a connection to: " + url);
  con = DriverManager.getConnection(url, "hive", "hive");
  System.out.println("Connection successful.\n");

  DatabaseMetaData dbmd = con.getMetaData();

  System.out.println("getDatabaseProductName() = "
      + dbmd.getDatabaseProductName());
  System.out.println("getDatabaseProductVersion() = "
      + dbmd.getDatabaseProductVersion());
  System.out.println("getDriverName() = " + dbmd.getDriverName());
  System.out.println("getDriverVersion() = " + dbmd.getDriverVersion());

  try {
     System.out.println("con.createStatement()");
     stmt = con.createStatement();

     System.out.println(drptab);
     stmt.executeUpdate(drptab);
     }

  catch (Exception ex)
  { 
    System.out.println("Exception: " + ex);
  }

  System.out.println(crttab);
  stmt.executeUpdate(crttab);

  System.out.println("preparing: " + instab);
  PreparedStatement pstmt = con.prepareStatement(instab);


  System.out.println("calling setDate() for parameter marker");

  java.sql.Date dt = java.sql.Date.valueOf("1980-12-26");
  pstmt.setDate(1, dt);
//pstmt.setString(1, "1980-12-26");

  System.out.println("executing: " + instab);
  pstmt.executeUpdate();


Hive-0.14 - Build # 986 - Still Failing

2015-06-16 Thread Apache Jenkins Server
Changes for Build #980

Changes for Build #981

Changes for Build #982

Changes for Build #983

Changes for Build #984

Changes for Build #985

Changes for Build #986



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #986)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-0.14/986/ to view 
the results.

Re: Review Request 34897: CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias in columnInfo which triggers PPD

2015-06-16 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34897/
---

(Updated June 16, 2015, 9:55 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

In ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 477: when 
aliases contains the empty string "" and the key is an empty string "" too, it 
assumes that aliases contains key. This triggers incorrect PPD. To reproduce it, 
apply HIVE-10455 and run cbo_subq_notin.q.
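
For illustration, a hypothetical sketch of the kind of guard implied above (the 
real OpProcFactory logic is more involved than this):

{noformat}
import java.util.Set;

// If the alias set contains "" and the key is also "", a bare contains() check
// matches even though no real alias matched.
public class AliasCheckSketch {
  static boolean aliasMatches(Set<String> aliases, String key) {
    // Guard against the degenerate empty-string case before the membership test.
    if (key == null || key.isEmpty()) {
      return false;
    }
    return aliases.contains(key);
  }
}
{noformat}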


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 9c21238 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverterPostProc.java
 e7c8342 

Diff: https://reviews.apache.org/r/34897/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Created] (HIVE-11026) Make vector_outer_join* test more robust

2015-06-16 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-11026:
---

 Summary: Make vector_outer_join* test more robust
 Key: HIVE-11026
 URL: https://issues.apache.org/jira/browse/HIVE-11026
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Ashutosh Chauhan


Different file sizes on different OSes result in different Data Size values in 
the explain output.





[jira] [Created] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative

2015-06-16 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-11027:
-

 Summary: Hive on tez: Bucket map joins fail when hashcode goes 
negative
 Key: HIVE-11027
 URL: https://issues.apache.org/jira/browse/HIVE-11027
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.0.0
Reporter: Vikram Dixit K
Assignee: Prasanth Jayachandran


We are seeing an issue when dynamic sort optimization is enabled while doing an 
insert into a bucketed table. We seem to be flipping the sign of the hashcode 
instead of taking its complement when routing the data. This results in 
correctness issues in bucket map joins in Hive on Tez when the hash code goes 
negative.
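
As an illustration of the sign-handling difference being described (this is not 
the actual Hive routing code):

{noformat}
// Flipping the sign of a negative hash code and masking its sign bit route negative
// hashes to different buckets, so a writer using one convention and a reader or join
// using the other will disagree -- the kind of mismatch described in this issue.
public class BucketRouting {
  static int bucketBySignFlip(int hash, int numBuckets) {
    return (hash < 0 ? -hash : hash) % numBuckets;
  }

  static int bucketByMask(int hash, int numBuckets) {
    return (hash & Integer.MAX_VALUE) % numBuckets;
  }

  public static void main(String[] args) {
    int hash = -1434494327;                        // an arbitrary negative hash code
    System.out.println(bucketBySignFlip(hash, 8)); // 7
    System.out.println(bucketByMask(hash, 8));     // 1
  }
}
{noformat}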





Review Request 35532: HIVE-11025 In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly

2015-06-16 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35532/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-11025 In windowing spec, when the datatype is decimal, it's comparing the 
value against NULL value incorrectly


Diffs
-

  data/files/emp2.txt 650aff7f2c8003fb7c04dfa377c2b25d04f3ce88 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java 
32471f2dc864c38a2969909efa5b21508e27d7f8 
  ql/src/test/queries/clientpositive/windowing_windowspec3.q 
608a6cf45e3c1e0b928800dae0470e8acfd77734 
  ql/src/test/results/clientpositive/windowing_windowspec3.q.out 
42c042f2cf80f0a5a8269ad9eb9864d7e76525cc 

Diff: https://reviews.apache.org/r/35532/diff/


Testing
---


Thanks,

Aihua Xu



[jira] [Created] (HIVE-11025) In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly

2015-06-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-11025:
---

 Summary: In windowing spec, when the datatype is decimal, it's 
comparing the value against NULL value incorrectly
 Key: HIVE-11025
 URL: https://issues.apache.org/jira/browse/HIVE-11025
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Given the following data and query,
{noformat}
deptno  empno  bonus  salary
30      7698   NULL   2850.0
30      7900   NULL   950.0
30      7844   0      1500.0

select avg(salary) over (partition by deptno order by bonus range 200 
preceding) from emp2;
{noformat}

it produces an incorrect result for the row in which bonus=0:
1900.0
1900.0
1766.7
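
A hedged sketch of the kind of null-safe boundary check this implies, using 
BigDecimal as a stand-in for Hive's decimal type (the real windowing code 
differs):

{noformat}
import java.math.BigDecimal;

// A RANGE boundary check must treat a NULL ordering value explicitly instead of
// comparing it like a number.
public class NullSafeRangeCheck {
  /** True if 'value' lies within 'amt' preceding of 'current'. */
  static boolean withinRangePreceding(BigDecimal value, BigDecimal current, BigDecimal amt) {
    if (value == null || current == null) {
      // A NULL ordering value can never satisfy a numeric RANGE boundary.
      return false;
    }
    return value.compareTo(current) <= 0
        && current.subtract(value).compareTo(amt) <= 0;
  }
}
{noformat}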









Review Request 35543: CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's mapInputToDP should depends on the last SEL

2015-06-16 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35543/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

In the dynamic partitioning case, for example, we end up with TS0-SEL1-SEL2-FS3. 
The dpCtx's mapInputToDP is populated from SEL1 rather than SEL2, which causes 
an error in the return path.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 58ee605 

Diff: https://reviews.apache.org/r/35543/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Created] (HIVE-11029) hadoop.proxyuser.mapr.groups does not work to restrict the groups that can be impersonated

2015-06-16 Thread Na Yang (JIRA)
Na Yang created HIVE-11029:
--

 Summary: hadoop.proxyuser.mapr.groups does not work to restrict 
the groups that can be impersonated
 Key: HIVE-11029
 URL: https://issues.apache.org/jira/browse/HIVE-11029
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.2.0, 1.0.0, 0.14.0, 0.13.0
Reporter: Na Yang
Assignee: Na Yang


In core-site.xml, hadoop.proxyuser.<user>.groups specifies the groups whose 
members can be impersonated by the HS2 proxy user. However, this does not work 
properly in Hive.

In my core-site.xml, I have the following configs:
<property>
  <name>hadoop.proxyuser.mapr.hosts</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.mapr.groups</name>
  <value>root</value>
</property>

With this configuration I would expect that 'mapr' can impersonate only members 
of the Unix group 'root'. However, if I submit a query as user 'jon', the query 
runs as user 'jon' even though 'mapr' should not be able to impersonate this 
user.
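
A rough sketch of the authorization check one would expect to reject this, 
assuming Hadoop's ProxyUsers API (how HiveServer2 wires this check in is not 
shown here, and that wiring is what appears to be broken):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.security.authorize.ProxyUsers;

// With the core-site.xml above, impersonating 'jon' (not in group 'root') via the
// proxy user 'mapr' should be rejected.
public class ProxyAuthSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.proxyuser.mapr.hosts", "*");
    conf.set("hadoop.proxyuser.mapr.groups", "root");
    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

    UserGroupInformation realUser = UserGroupInformation.createRemoteUser("mapr");
    UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser("jon", realUser);
    try {
      ProxyUsers.authorize(proxyUgi, "127.0.0.1");
      System.out.println("impersonation allowed");                     // the reported bug
    } catch (AuthorizationException e) {
      System.out.println("impersonation rejected: " + e.getMessage()); // expected behavior
    }
  }
}
{noformat}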







[jira] [Created] (HIVE-11028) Tez: table self join and join with another table fails with IndexOutOfBoundsException

2015-06-16 Thread Jason Dere (JIRA)
Jason Dere created HIVE-11028:
-

 Summary: Tez: table self join and join with another table fails 
with IndexOutOfBoundsException
 Key: HIVE-11028
 URL: https://issues.apache.org/jira/browse/HIVE-11028
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Jason Dere
Assignee: Jason Dere


{noformat}
create table tez_self_join1(id1 int, id2 string, id3 string);
insert into table tez_self_join1 values(1, 'aa','bb'), (2, 'ab','ab'), 
(3,'ba','ba');

create table tez_self_join2(id1 int);
insert into table tez_self_join2 values(1),(2),(3);

explain
select s.id2, s.id3
from
(
 select self1.id1, self1.id2, self1.id3
 from tez_self_join1 self1 join tez_self_join1 self2
 on self1.id2=self2.id3 ) s
join tez_self_join2
on s.id1=tez_self_join2.id1
where s.id2='ab';
{noformat}

fails with error:

{noformat}
2015-06-16 15:41:55,759 ERROR [main]: ql.Driver (SessionState.java:printError(979)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 3, vertexId=vertex_1434494327112_0002_4_04, diagnostics=[Task failed, taskId=task_1434494327112_0002_4_04_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:109)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:313)
at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:71)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:99)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
... 13 more
{noformat}





[jira] [Created] (HIVE-11032) Enable more tests for grouping by skewed data [Spark Branch]

2015-06-16 Thread Rui Li (JIRA)
Rui Li created HIVE-11032:
-

 Summary: Enable more tests for grouping by skewed data [Spark 
Branch]
 Key: HIVE-11032
 URL: https://issues.apache.org/jira/browse/HIVE-11032
 Project: Hive
  Issue Type: Sub-task
Reporter: Rui Li
Priority: Minor


Not all such tests are enabled, e.g. {{groupby1_map_skew.q}}. We can use this 
JIRA to track whether we need more of them.
Basically, we need to look at all tests with {{set 
hive.groupby.skewindata=true;}}.





[jira] [Created] (HIVE-11033) BloomFilter index is not honored by ORC reader

2015-06-16 Thread Allan Yan (JIRA)
Allan Yan created HIVE-11033:


 Summary: BloomFilter index is not honored by ORC reader
 Key: HIVE-11033
 URL: https://issues.apache.org/jira/browse/HIVE-11033
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Allan Yan


There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class 
which causes the bloom filter index saved in the ORC file not to be used. The 
reason is that the bloomFilterIndices variable declared in the SargApplier class 
shadows the one in its parent class.

Here is one way to fix it
{noformat}
18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
174d173
< bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
178c177
<   sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
---
>   sarg, options.getColumnNames(), strideRate, types, included.length);
204a204
> bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
673c673
< List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
---
> List<OrcProto.Type> types, int includedCount) {
677c677
<   this.bloomFilterIndices = bloomFilterIndices;
---
>   bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
{noformat}
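
A generic illustration of the field-shadowing problem (these are not the actual 
ORC classes):

{noformat}
// The subclass declares its own bloomFilterIndices, so values assigned to the
// parent's field are never seen by code that reads the child's field.
public class ShadowingSketch {
  static class Parent {
    protected int[] bloomFilterIndices;
  }

  static class Child extends Parent {
    // Re-declaring the field hides Parent.bloomFilterIndices.
    protected int[] bloomFilterIndices;

    void useIndex() {
      // Reads the Child field, which stays null even if Parent's was populated.
      System.out.println(bloomFilterIndices == null ? "index ignored" : "index used");
    }
  }

  public static void main(String[] args) {
    Child c = new Child();
    ((Parent) c).bloomFilterIndices = new int[] {1, 2, 3}; // populate the parent's field
    c.useIndex();                                          // prints "index ignored"
  }
}
{noformat}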







[jira] [Created] (HIVE-11034) Multiple join table producing different results

2015-06-16 Thread Srini Pindi (JIRA)
Srini Pindi created HIVE-11034:
--

 Summary: Multiple join table producing different results
 Key: HIVE-11034
 URL: https://issues.apache.org/jira/browse/HIVE-11034
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
 Environment: Linux 2.6.32-279.19.1.el6.x86_64

Reporter: Srini Pindi
Priority: Critical


A join between one main table and other tables on different join columns 
returns wrong results in Hive. Changing the order of the joins between the main 
table and the other tables produces different results.





[jira] [Created] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics

2015-06-16 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11031:


 Summary: ORC concatenation of old files can fail while merging 
column statistics
 Key: HIVE-11031
 URL: https://issues.apache.org/jira/browse/HIVE-11031
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0, 1.0.0, 1.1.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Column statistics in ORC are optional protobuf fields. Old ORC files might not 
have statistics for newly added types like decimal, date, timestamp, etc., but 
column statistics merging assumes statistics exist for these types and invokes 
merge. For example, merging of TimestampColumnStatistics directly casts the 
received ColumnStatistics object without an instanceof check. If the ORC file 
contains timestamp column statistics this works; otherwise it throws a 
ClassCastException.

Also, the file merge operator swallows the exception.
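
A hedged sketch of the defensive check this implies (the real ORC statistics 
classes and merge entry points differ; the types below are simplified 
stand-ins):

{noformat}
// Verify the concrete statistics type before casting, since old files may not
// carry it at all.
public class StatsMergeSketch {
  interface ColumnStatistics { }

  static class TimestampColumnStatistics implements ColumnStatistics {
    Long min, max;
    void merge(TimestampColumnStatistics other) {
      if (other.min != null) { min = (min == null) ? other.min : Math.min(min, other.min); }
      if (other.max != null) { max = (max == null) ? other.max : Math.max(max, other.max); }
    }
  }

  static void mergeTimestamp(TimestampColumnStatistics target, ColumnStatistics incoming) {
    if (incoming instanceof TimestampColumnStatistics) {
      target.merge((TimestampColumnStatistics) incoming);
    }
    // else: the older file has no timestamp statistics for this column; skip it
    // rather than cast blindly and hit a ClassCastException.
  }
}
{noformat}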





[jira] [Created] (HIVE-11030) Enhance storage layer to create one delta file per write

2015-06-16 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-11030:
-

 Summary: Enhance storage layer to create one delta file per write
 Key: HIVE-11030
 URL: https://issues.apache.org/jira/browse/HIVE-11030
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Currently each txn using ACID insert/update/delete will generate a delta 
directory like delta_100_101.  In order to support multi-statement 
transactions we must generate one delta per operation within the transaction so 
the deltas would be named like delta_100_101_0001, etc.

Support for MERGE (HIVE-10924) would need the same.
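
For illustration only, per-statement delta naming could look like the following 
(the exact prefix and zero-padding used by Hive's AcidUtils are assumptions 
here; the statement-id padding follows the example above):

{noformat}
// Hypothetical formatting helpers for per-transaction and per-statement delta dirs.
public class DeltaNameSketch {
  static String deltaDir(long minTxn, long maxTxn) {
    return String.format("delta_%d_%d", minTxn, maxTxn);
  }

  static String deltaDir(long minTxn, long maxTxn, int statementId) {
    return String.format("delta_%d_%d_%04d", minTxn, maxTxn, statementId);
  }

  public static void main(String[] args) {
    System.out.println(deltaDir(100, 101));    // delta_100_101
    System.out.println(deltaDir(100, 101, 1)); // delta_100_101_0001
  }
}
{noformat}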





Re: [ANNOUNCE] New Hive PMC Members - Chao Sun and Gopal Vijayaraghavan

2015-06-16 Thread Damien Carol
Congrats !

Damien CAROL

   - tel: +33 (0)4 74 96 88 14
   - email: dca...@blitzbs.com

BLITZ BUSINESS SERVICE

2015-06-15 18:41 GMT+02:00 Gunther Hagleitner ghagleit...@hortonworks.com:

 Congrats Chao and Gopal!

 Cheers,
 Gunther.
 
 From: amareshwarisr . amareshw...@gmail.com
 Sent: Sunday, June 14, 2015 9:47 PM
 To: dev@hive.apache.org
 Subject: Re: [ANNOUNCE] New Hive PMC Members - Chao Sun and Gopal
 Vijayaraghavan

 Congratulations Chao and Gopal !

 Thanks,
 Amareshwari

 On Thu, Jun 11, 2015 at 2:50 AM, Carl Steinbach c...@apache.org wrote:

  I am pleased to announce that Chao Sun and Gopal Vijayaraghavan have been
  elected to the Hive Project Management Committee. Please join me in
  congratulating Chao and Gopal!
 
  Thanks.
 
  - Carl
 



Getting ready for 1.2.1

2015-06-16 Thread Sushanth Sowmyan
Hi folks,

It's been nearly a month since 1.2.0, and when I did that release, I said
I'd keep the branch open for any further non-db-changing, non-breaking
patches, and from the sheer number of patches registered on the status
page, that's been a good idea.

Now, I think it's time to start drawing that to a close so we can see a
stabilization update, and I would like to begin the process of rolling out
release candidates for 1.2.1. I would like to start rolling out an RC0 by
Wednesday night if no one objects.

For now, the rules on committing to branch-1.2 remain the same:
a) commit to branch-1 & master first
b) add me as a watcher on that jira
c) add the bug to the release status wiki.

Once I start the release process, I will once again increase the bar for
commits as we did the last time. That said, this time, once we finish the
release for 1.2.1, the bar on further commits to branch-1.2 is intended to
remain at a higher level, so as to make sure we don't have too much of a
back porting hassle - we will soon try to limit our commits to branch-1 and
master only.

Cheers,
-Sushanth


[jira] [Created] (HIVE-11020) support partial scan for analyze command - Avro

2015-06-16 Thread Bing Li (JIRA)
Bing Li created HIVE-11020:
--

 Summary: support partial scan for analyze command - Avro
 Key: HIVE-11020
 URL: https://issues.apache.org/jira/browse/HIVE-11020
 Project: Hive
  Issue Type: Improvement
Reporter: Bing Li
Assignee: Bing Li


This is a follow-up to HIVE-3958.

We already have two similar Jiras
- support partial scan for analyze command - ORC
https://issues.apache.org/jira/browse/HIVE-4177

- [Parquet] Support Analyze Table with partial scan
https://issues.apache.org/jira/browse/HIVE-9491





[jira] [Created] (HIVE-11019) Can't create an Avro table with uniontype column correctly

2015-06-16 Thread Bing Li (JIRA)
Bing Li created HIVE-11019:
--

 Summary: Can't create an Avro table with uniontype column correctly
 Key: HIVE-11019
 URL: https://issues.apache.org/jira/browse/HIVE-11019
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Bing Li


I tried the example in 
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe
and found that Hive can't create an Avro table with a uniontype column correctly:

hive> create table avro_union(union1 uniontype<FLOAT, BOOLEAN, STRING>) STORED AS AVRO;
OK
Time taken: 0.083 seconds
hive> describe avro_union;
OK
union1                  uniontype<void,float,boolean,string>
Time taken: 0.058 seconds, Fetched: 1 row(s)


