[GitHub] hive pull request #172: HIVE-15642: Replicate Insert Overwrites, Dynamic Par...

2017-05-03 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/172


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] hive pull request #167: HIVE-16344: Test and support replication of exchange...

2017-05-03 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/167




[GitHub] hive pull request #170: HIVE-16488: Support replicating into existing db if ...

2017-05-03 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/170




Re: Review Request 50787: Add a timezone-aware timestamp

2017-05-03 Thread Rui Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50787/
---

(Updated May 3, 2017, 6:34 a.m.)


Review request for hive.


Bugs: HIVE-14412
https://issues.apache.org/jira/browse/HIVE-14412


Repository: hive-git


Description
---

The first patch to add a timezone-aware timestamp.
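For reviewers skimming the thread, here is a minimal illustrative sketch of the semantics a timezone-aware timestamp needs, using java.time; the class name and methods below are assumptions for illustration only, not the contents of the TimestampTZ file in this diff.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative sketch only: a timezone-aware timestamp wraps a date-time
// plus an explicit zone or offset, and compares by the instant it denotes
// rather than by its local field values.
public class TimestampTZSketch implements Comparable<TimestampTZSketch> {
    private final ZonedDateTime zonedDateTime;

    public TimestampTZSketch(ZonedDateTime zdt) {
        this.zonedDateTime = zdt;
    }

    public static TimestampTZSketch parse(String s) {
        // ISO-8601 with an offset or region zone, e.g. "2017-05-03T06:34:00+08:00"
        return new TimestampTZSketch(ZonedDateTime.parse(s));
    }

    @Override
    public int compareTo(TimestampTZSketch other) {
        // The same instant expressed in different zones compares equal.
        return zonedDateTime.toInstant().compareTo(other.zonedDateTime.toInstant());
    }

    @Override
    public String toString() {
        return zonedDateTime.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }
}
```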


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/type/TimestampTZ.java PRE-CREATION 
  common/src/test/org/apache/hadoop/hive/common/type/TestTimestampTZ.java PRE-CREATION 
  contrib/src/test/queries/clientnegative/serde_regex.q a676338 
  contrib/src/test/queries/clientpositive/serde_regex.q d75d607 
  contrib/src/test/results/clientnegative/serde_regex.q.out 58b1c02 
  contrib/src/test/results/clientpositive/serde_regex.q.out 2984293 
  hbase-handler/src/test/queries/positive/hbase_timestamp.q 0350afe 
  hbase-handler/src/test/results/positive/hbase_timestamp.q.out 3918121 
  itests/hive-blobstore/src/test/queries/clientpositive/orc_format_part.q 358eccd 
  itests/hive-blobstore/src/test/queries/clientpositive/orc_nonstd_partitions_loc.q c462538 
  itests/hive-blobstore/src/test/queries/clientpositive/rcfile_format_part.q c563d3a 
  itests/hive-blobstore/src/test/queries/clientpositive/rcfile_nonstd_partitions_loc.q d17c281 
  itests/hive-blobstore/src/test/results/clientpositive/orc_format_part.q.out 5d1319f 
  itests/hive-blobstore/src/test/results/clientpositive/orc_nonstd_partitions_loc.q.out 70e72f7 
  itests/hive-blobstore/src/test/results/clientpositive/rcfile_format_part.q.out bed10ab 
  itests/hive-blobstore/src/test/results/clientpositive/rcfile_nonstd_partitions_loc.q.out c6442f9 
  jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java ade1900 
  jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java 38918f0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8dc5f2e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java f8b55da 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 01a652d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/TypeConverter.java 38308c9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 0cf9205 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 0721b92 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g d98a663 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 8598fae 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 8f8eab0 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java bda2050 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 7cdf2c3 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 5cacd59 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java 68d98f5 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 5a31e61 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToTimestampTZ.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestSQL11ReservedKeyWordsNegative.java 0dc6b19 
  ql/src/test/queries/clientnegative/serde_regex.q c9cfc7d 
  ql/src/test/queries/clientnegative/serde_regex2.q a29bb9c 
  ql/src/test/queries/clientnegative/serde_regex3.q 4e91f06 
  ql/src/test/queries/clientpositive/create_like.q bd39731 
  ql/src/test/queries/clientpositive/join43.q 12c45a6 
  ql/src/test/queries/clientpositive/serde_regex.q e21c6e1 
  ql/src/test/queries/clientpositive/timestamptz.q PRE-CREATION 
  ql/src/test/queries/clientpositive/timestamptz_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/timestamptz_2.q PRE-CREATION 
  ql/src/test/results/clientnegative/serde_regex.q.out a1ec5ca 
  ql/src/test/results/clientnegative/serde_regex2.q.out 374675d 
  ql/src/test/results/clientnegative/serde_regex3.q.out dc0a0e2 
  ql/src/test/results/clientpositive/create_like.q.out ff2e752 
  ql/src/test/results/clientpositive/join43.q.out e8c7278 
  ql/src/test/results/clientpositive/serde_regex.q.out 7bebb0c 
  ql/src/test/results/clientpositive/timestamptz.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/timestamptz_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/timestamptz_2.q.out PRE-CREATION 
  serde/if/serde.thrift 1d40d5a 
  serde/src/gen/thrift/gen-cpp/serde_constants.h 8785bd2 
  serde/src/gen/thrift/gen-cpp/serde_constants.cpp 907acf2 
  serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java 2578d3e 
  serde/src/gen/thrift/gen-php/org/apache/hadoop/hive/serde/Types.php ea2dbbe 
  serde/src/gen/thrift/gen-py/org_apache_hadoop_hive_serde/constants.py e3b24eb 
  serde/src/gen/thrift/gen-rb/serde_constants.rb 15efaea 
  serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java 5ecfbca 
  serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 89e15c3 
  

Re: [VOTE] Apache Hive 2.3.0 Release Candidate 0

2017-05-03 Thread Sergio Pena
Thanks Rui.

Pengcheng, the patch is reverted, you may continue with the RC1.

On Tue, May 2, 2017 at 11:02 PM, Rui Li  wrote:

> The patch has been reverted in master and branch-2.3
>
> On Wed, May 3, 2017 at 3:01 AM, Sergio Pena 
> wrote:
>
> > Hi Pengcheng,
> >
> > There is a request from the HDFS team to revert the patch committed on
> > HIVE-16047 from
> > our code because it might cause problems when future Hadoop versions are
> > released due to being a
> > private API on Hadoop. This API method signature has been changed between
> > releases, and
> > we don't want to have additional shims to support future Hadoop versions
> > just for this method.
> >
> > I'd like to revert it from 2.3.0 release before doing the release. It is
> > marked as being fixed on 2.2 but it is not cherry-picked on branch-2.2
> but
> > branch-2.3.
> >
> > Do you agree?
> >
> > - Sergio
> >
> > On Fri, Apr 28, 2017 at 1:40 PM, Pengcheng Xiong 
> > wrote:
> >
> > > Withdraw the VOTE on candidate 0. Will propose candidate 1 soon.
> Thanks.
> > >
> > > On Thu, Apr 27, 2017 at 8:10 PM, Owen O'Malley  >
> > > wrote:
> > >
> > > > -1 you need a release of storage-API first.
> > > >
> > > > .. Owen
> > > >
> > > > > On Apr 27, 2017, at 17:43, Pengcheng Xiong 
> > wrote:
> > > > >
> > > > > Apache Hive 2.3.0 Release Candidate 0 is available here:
> > > > > http://home.apache.org/~pxiong/apache-hive-2.3.0-rc0/
> > > > >
> > > > >
> > > > > Maven artifacts are available here:
> > > > > https://repository.apache.org/content/repositories/
> > orgapachehive-1073/
> > > > >
> > > > >
> > > > > Source tag for RC0 is at:
> > > > >
> > > > > https://github.com/apache/hive/releases/tag/release-2.3.0-rc0
> > > > >
> > > > > Voting will conclude in 72 hours.
> > > > >
> > > > > Hive PMC Members: Please test and vote.
> > > > >
> > > > > Thanks.
> > > >
> > >
> >
>
>
>
> --
> Best regards!
> Rui Li
> Cell: (+86) 13564950210
>


[jira] [Created] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-16575:
--

 Summary: Support for 'UNIQUE' and 'NOT NULL' constraints
 Key: HIVE-16575
 URL: https://issues.apache.org/jira/browse/HIVE-16575
 Project: Hive
  Issue Type: New Feature
  Components: CBO, Logical Optimizer, Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Follow-up on HIVE-13076.

This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when we 
create or alter a table 
(https://www.postgresql.org/docs/9.6/static/sql-createtable.html).

As with PK and FK constraints, we do not currently enforce them; thus the 
constraints must be declared with the DISABLE option, but they are stored and can 
be enabled for rewriting/optimization using RELY.

This patch also adds support for inlining the constraints next to the column 
type definition, i.e., 'column constraints'.

Some examples of the extension to the syntax included in the patch:
{code:sql}
CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE,
  CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE);
CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE,
  y string CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE,
  CONSTRAINT fk2 FOREIGN KEY (x) REFERENCES table2(a) DISABLE,
  CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b STRING);
CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b STRING);
CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE RELY, b STRING);
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] hive pull request #175: HIVE-16575: Support for 'UNIQUE' and 'NOT NULL' cons...

2017-05-03 Thread jcamachor
GitHub user jcamachor opened a pull request:

https://github.com/apache/hive/pull/175

HIVE-16575: Support for 'UNIQUE' and 'NOT NULL' constraints



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcamachor/hive not_null

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #175


commit 686ed38fdcf5b67f9748631795dbeaf2a8a2b692
Author: Jesus Camacho Rodriguez 
Date:   2017-05-03T09:09:49Z

HIVE-16575: Support for 'UNIQUE' and 'NOT NULL' constraints






Re: Review Request 58501: HIVE-16469: Parquet timestamp table property is not always taken into account

2017-05-03 Thread Barna Zsombor Klara

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58501/
---

(Updated May 3, 2017, 12:59 p.m.)


Review request for hive, Sergio Pena and Zoltan Ivanfi.


Changes
---

Updated based on comments.


Bugs: HIVE-16469
https://issues.apache.org/jira/browse/HIVE-16469


Repository: hive-git


Description
---

HIVE-16469: Parquet timestamp table property is not always taken into account


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 757b7fc0eaa39c956014aa446ab1b07fc4abf8d3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 13750cdc34711d22f2adf2f483a6773ad05fb8d2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 1bd4db7805689ae1f91921ffbb5ff7da59f4bf60 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java f4fadbb61bf45f62945700284c0b050f0984b696 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 2954601ce5bb25905cdb29ca0ca4551c2ca12b95 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 6413c5add6db2e8c9298285b15dba33ee74379a8 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java b339cc4347eea143dca2f6d98f9aaafdc427 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java dbd6fb3d0bc8c753abf86e99b52377617f248b5a 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java c81499a91c84af3ba33f335506c1c44e7085f13d 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRowGroupFilter.java bf363f32a3ac0a4d790e2925d802c6e210adfb4b 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java f2d79cf9d215e9a6e2a5e88cfc78378be860fd1f 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java 1e10dbf18742524982606f1e6c6d447d683b2dc3 
  ql/src/test/queries/clientnegative/parquet_int96_alter_invalid_timezone.q PRE-CREATION 
  ql/src/test/queries/clientnegative/parquet_int96_create_invalid_timezone.q PRE-CREATION 
  ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 6eadd1b0a3313cbba7a798890b802baae302749e 
  ql/src/test/results/clientnegative/parquet_int96_alter_invalid_timezone.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/parquet_int96_create_invalid_timezone.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_int96_timestamp.q.out b9a3664458a83f1856e4bc59eba5d56665df61cc 
  ql/src/test/results/clientpositive/spark/parquet_int96_timestamp.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/58501/diff/4/

Changes: https://reviews.apache.org/r/58501/diff/3-4/


Testing
---

Added qtests for the following cases:
- order by clause
- selfjoin
- calling UDFs with the timestamp values
- where clause with a constant cast as timestamp
- test for HoS
- implicit and explicit timestamp conversions in insert clause

Tested manually but no qtests:
- join between 3 tables all parquet but with different/no timezone property
- subselect in from/where clauses
- exists / union / no exists


Thanks,

Barna Zsombor Klara



Re: Review Request 58501: HIVE-16469: Parquet timestamp table property is not always taken into account

2017-05-03 Thread Barna Zsombor Klara


> On May 2, 2017, 5:27 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java
> > Line 72 (original), 72 (patched)
> > 
> >
> > How does this work? I don't understand this change.

The user.timezone system property is used to set the default timezone of the 
JVM. If this is set on the HS2 instance then we need to propagate it to the 
child VM spawned by a local task or timestamps read by the local task will be 
incorrect.
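The propagation described above can be sketched roughly as follows; the method name and the way MapRedTask actually builds the child JVM's option string are assumptions, not the patch itself.

```java
// Illustrative sketch: forward the parent JVM's user.timezone setting into
// the option string used to launch a child VM for a local task, so that
// timestamps read by the child are adjusted the same way as in HS2.
public class TimezonePropagation {
    static String appendTimezoneOpt(String childJvmOpts) {
        String tz = System.getProperty("user.timezone");
        if (tz != null && !tz.isEmpty() && !childJvmOpts.contains("-Duser.timezone")) {
            return childJvmOpts + " -Duser.timezone=" + tz;
        }
        return childJvmOpts;
    }
}
```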


> On May 2, 2017, 5:27 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
> > Line 181 (original), 181 (patched)
> > 
> >
> > Is this compatible with old parquet tables? If the property is not set, 
> > then validateTimeZone might fail, right? If so, do we want to fail 
> > reading tables that do not have the property set?
> > 
> > Something else to consider: if a user sets a timezone improperly in a 
> > different tool, or something happened and we got an invalid timezone, 
> > do we want to fail when reading those files? Just wondering about this 
> > scenario, no need to fix it right away.

At this point the timezone property must have been set by 
ParquetTableUtils#setParquetTimeZoneIfAbsent, either from the table properties 
or using the default value from TimeZone#getDefault. The core problem is that I 
found it very difficult to make sure that every execution path checks the 
table property:
- The FetchOperator works when we have a local task, but the 
MapredParquetInputFormat does not (MapWork is null). 
- The FetchOperator will not work with a complex query or an order by clause, 
but the InputFormat should work in this case. 
- For statistics gathering, only the StatsNoJobTask is executed.
I wanted to make sure that if there is an execution path I forgot about, we 
fail rather than silently read incorrect timestamp values.
Similarly, in my opinion, if the timezone value is invalid (because it was set 
by another tool) we should fail instead of reading ill-adjusted values.
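The fail-fast stance described above can be illustrated with a small validator; this is an assumption about what such a validateTimeZone check has to do, not a copy of the actual method. java.util.TimeZone silently falls back to GMT for unrecognized IDs, so a validator must detect that fallback explicitly to turn silent misreads into a loud failure.

```java
import java.util.TimeZone;

public class TimezoneValidator {
    // Illustrative sketch: TimeZone.getTimeZone returns the GMT zone for
    // unrecognized IDs instead of throwing, which would silently shift
    // timestamp values; detect that fallback and fail loudly instead.
    static void validateTimeZone(String id) {
        if (id == null) {
            throw new IllegalArgumentException("Timezone property not set");
        }
        TimeZone tz = TimeZone.getTimeZone(id);
        if ("GMT".equals(tz.getID()) && !"GMT".equals(id)) {
            throw new IllegalArgumentException("Unexpected timezone id: " + id);
        }
    }
}
```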


> On May 2, 2017, 5:27 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java
> > Lines 35 (patched)
> > 
> >
> > Why is a raw Map used instead of Map<String, String>? Aren't all table 
> > properties key/value string pairs?
> > 
> > Also, the ensureTablePropertySet() name seems unrelated to what we 
> > want to do. I thought it was going to throw an exception if the property 
> > was not set, but it is setting the value on the JobConf. Should we use a 
> > different name, such as setParquetTimeZoneIfNotSet() or 
> > setParquetTimeZoneIfAbsent(), so we understand it quickly without 
> > looking at the javadoc?

We are calling this method with Properties objects (e.g. from the 
FetchOperator) and with Map<String, String> objects (e.g. from the 
StatsNoJobTask), and the common ancestor of these two types is the raw Map. 
While it is true that table properties can only be Strings, so the Properties 
should only contain String pairs, I wanted to avoid the explicit cast.
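The Properties-vs-Map point can be seen in a short sketch; the property key below is a placeholder and the method body is an assumption about the shape of setParquetTimeZoneIfAbsent, not the actual implementation. java.util.Properties implements Map<Object, Object> while table parameters arrive as Map<String, String>, so a raw Map parameter accepts both without an explicit cast at the call sites.

```java
import java.util.Map;
import java.util.TimeZone;

public class ParquetTimeZoneDefaults {
    // Placeholder key for illustration; the real table property name may differ.
    static final String TIMEZONE_PROPERTY = "parquet.mr.int96.write.zone";

    // A raw Map accepts both Properties (Map<Object, Object>) and table
    // parameters (Map<String, String>) without casting at the call sites.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void setParquetTimeZoneIfAbsent(Map props) {
        if (!props.containsKey(TIMEZONE_PROPERTY)) {
            // Default to the JVM's timezone when the table does not set one.
            props.put(TIMEZONE_PROPERTY, TimeZone.getDefault().getID());
        }
    }
}
```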


- Barna Zsombor


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58501/#review173610
---


On May 3, 2017, 12:59 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58501/
> ---
> 
> (Updated May 3, 2017, 12:59 p.m.)
> 
> 
> Review request for hive, Sergio Pena and Zoltan Ivanfi.
> 
> 
> Bugs: HIVE-16469
> https://issues.apache.org/jira/browse/HIVE-16469
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16469: Parquet timestamp table property is not always taken into account
> 
> 

Re: Review Request 58865: HIVE-16552: Limit the number of tasks a Spark job may contain

2017-05-03 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58865/
---

(Updated May 3, 2017, 6:14 p.m.)


Review request for hive.


Bugs: HIVE-16552
https://issues.apache.org/jira/browse/HIVE-16552


Repository: hive-git


Description
---

See JIRA description


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84398c6 
  itests/src/test/resources/testconfiguration.properties 753f3a9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 32a7730 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/RemoteSparkJobMonitor.java dd73f3e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java 0b224f2 
  ql/src/test/queries/clientnegative/spark_job_max_tasks.q PRE-CREATION 
  ql/src/test/results/clientnegative/spark/spark_job_max_tasks.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/58865/diff/4/

Changes: https://reviews.apache.org/r/58865/diff/3-4/


Testing
---

Test locally


Thanks,

Xuefu Zhang



Re: Review Request 58501: HIVE-16469: Parquet timestamp table property is not always taken into account

2017-05-03 Thread Vihang Karajgaonkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58501/#review173747
---




ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java
Lines 115-120 (patched)


Should the logs here be warnings?


- Vihang Karajgaonkar


On May 3, 2017, 12:59 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58501/
> ---
> 
> (Updated May 3, 2017, 12:59 p.m.)
> 
> 
> Review request for hive, Sergio Pena and Zoltan Ivanfi.
> 
> 
> Bugs: HIVE-16469
> https://issues.apache.org/jira/browse/HIVE-16469
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16469: Parquet timestamp table property is not always taken into account
> 
> 
> Diff: https://reviews.apache.org/r/58501/diff/4/
> 
> 
> Testing
> ---
> 
> Added qtests for the following cases:
> - order by clause
> - selfjoin
> - calling UDFs with the timestamp values
> - where clause with a constant cast as timestamp
> - test for HoS
> - implicit and explicit timestamp conversions in insert clause
> 
> Tested manually but no qtests:
> - join between 3 tables all parquet but with different/no timezone property
> - subselect in from/where clauses
> - exists / union / no exists
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>



[jira] [Created] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-16576:
---

 Summary: Fix encoding of intervals when fetching select query 
candidates from druid
 Key: HIVE-16576
 URL: https://issues.apache.org/jira/browse/HIVE-16576
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Debug logs on HIVE side - 
{code}
2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
client.NettyHttpClient: [GET 
http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
 Got response: 500 Server Error
{code}

Druid exception stack trace - 
{code}
2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
org.eclipse.jetty.servlet.ServletHandler - 
/druid/v2/datasources/cmv_basetable_druid/candidates
java.lang.IllegalArgumentException: Invalid format: ""1900-01-01T00:00:00.000 
05:53:20"
at 
org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
 ~[joda-time-2.8.2.jar:2.8.2]
at 
org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.<init>(Interval.java:193) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.parse(Interval.java:69) 
~[joda-time-2.8.2.jar:2.8.2]
at 
io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320) 
~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_92]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_92]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
{code}

Note that intervals being sent as part of the HTTP request URL are not encoded 
properly. 
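A hedged sketch of the implied fix: percent-encode the interval before appending it to the request URL, so the '+' in an offset like +05:53:20 survives instead of being decoded as the space seen in the Druid stack trace above. The helper below uses java.net.URLEncoder; the actual Hive change may use a different utility, and the method name is illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class IntervalEncoding {
    // '+' inside an offset such as +05:53:20 is decoded as a space by the
    // server unless the interval is percent-encoded before being placed in
    // the query string; ':' and '/' are escaped as well.
    static String encodeInterval(String interval) {
        try {
            return URLEncoder.encode(interval, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```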





Review Request 58973: HIVE-16578

2017-05-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
---

Review request for hive and Jason Dere.


Repository: hive-git


Description
---

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 


Diff: https://reviews.apache.org/r/58973/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 58934: HIVE-16568: Support complex types in external LLAP InputFormat

2017-05-03 Thread Jason Dere


> On May 3, 2017, 5:34 a.m., Prasanth_J wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/LlapRowRecordReader.java
> > Lines 154 (patched)
> > 
> >
> > IIRC, there are utilities already to do this in ObjectInspectorUtils.. 
> > copyToXXX() methods.. can that be reused?

I had not seen the methods in ObjectInspectorUtils; yeah, we might be able to 
use those.


> On May 3, 2017, 5:34 a.m., Prasanth_J wrote:
> > llap-common/src/java/org/apache/hadoop/hive/llap/TypeDesc.java
> > Lines 154 (patched)
> > 
> >
> > This also looks repetitive. TypeInfoUtils already has something like 
> > this I guess. We need to make sure TypeInfo parser can parse the string 
> > generated by this method. It's easier to reuse TypeInfoUtils or have a 
> > TypeDesc converted to TypeInfo.
> > 
> > My point there is duplicacy in
> > TypeInfo
> > TypeDesc
> > TypeDescriptor (ORC has this)
> > 
> > Wondering if TypeInfo or TypeDescriptor from ORC can be reused here. 
> > Thoughts?

I had originally created TypeDesc to try to avoid extra dependencies on what 
might be considered internal Hive types, but it is true that we still use the 
serde lib and TypeInfo is available from there. At this point, though, TypeDesc 
has already been created, and I'm not totally sure I want to go and remove it 
now. If you feel strongly about this, let me know.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58934/#review173697
---


On May 2, 2017, 9:52 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58934/
> ---
> 
> (Updated May 2, 2017, 9:52 p.m.)
> 
> 
> Review request for hive, Gunther Hagleitner, Prasanth_J, and Siddharth Seth.
> 
> 
> Bugs: HIVE-16568
> https://issues.apache.org/jira/browse/HIVE-16568
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - Support list/map/struct types in the LLAPRowInputFormat Schema/TypeDesc
> - Support list/map/struct types in the LLAPRowInputFormat Row. Changes in the 
> Row getters/setters needed (no longer using Writable).
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlap.java 
> de47412 
>   llap-client/src/java/org/apache/hadoop/hive/llap/LlapRowRecordReader.java 
> ee92f3e 
>   llap-common/src/java/org/apache/hadoop/hive/llap/Row.java a84fadc 
>   llap-common/src/java/org/apache/hadoop/hive/llap/TypeDesc.java dda5928 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
> 9ddbd7e 
> 
> 
> Diff: https://reviews.apache.org/r/58934/diff/1/
> 
> 
> Testing
> ---
> 
> Added test to TestJdbcWithMiniLlap
> 
> 
> Thanks,
> 
> Jason Dere
> 
>



CFP for Dataworks Summit Sydney

2017-05-03 Thread Alan Gates
Resend, sorry if this shows up twice.

The Australia/Pacific version of Dataworks Summit is in Sydney this year, 
September 20-21.   This is a great place to talk about work you are doing in 
Apache Hive or how you are using Hive.  Information on submitting an abstract 
is at https://dataworkssummit.com/sydney-2017/abstracts/submit-abstract/

Tracks:
Apache Hadoop
Apache Spark and Data Science
Cloud and Applications
Data Processing and Warehousing
Enterprise Adoption
IoT and Streaming
Operations, Governance and Security

Deadline: Friday, May 26th, 2017.

Alan.



[jira] [Created] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-16577:
--

 Summary: Syntax error in the metastore init scripts for mssql
 Key: HIVE-16577
 URL: https://issues.apache.org/jira/browse/HIVE-16577
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar
Priority: Blocker


HIVE-10562 introduced a new column to the {{NOTIFICATION_LOG}} table. The 
modified mssql init scripts have a syntax error and fail to initialize the 
metastore schema from 2.2.0 onwards.





[jira] [Created] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-16578:
-

 Summary: Semijoin Hints should use column name, if provided for 
partition key check
 Key: HIVE-16578
 URL: https://issues.apache.org/jira/browse/HIVE-16578
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


The current logic does not verify the column name provided in the hint against 
the column from which the runtime filtering branch will originate.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58934: HIVE-16568: Support complex types in external LLAP InputFormat

2017-05-03 Thread j . prasanth . j


> On May 3, 2017, 5:34 a.m., Prasanth_J wrote:
> > llap-common/src/java/org/apache/hadoop/hive/llap/TypeDesc.java
> > Lines 154 (patched)
> > 
> >
> > This also looks repetitive. TypeInfoUtils already has something like 
> > this I guess. We need to make sure TypeInfo parser can parse the string 
> > generated by this method. It's easier to reuse TypeInfoUtils or have a 
> > TypeDesc converted to TypeInfo.
> > 
> > My point is that there is duplication in
> > TypeInfo
> > TypeDesc
> > TypeDescriptor (ORC has this)
> > 
> > Wondering if TypeInfo or TypeDescriptor from ORC can be reused here. 
> > Thoughts?
> 
> Jason Dere wrote:
> I had originally created TypeDesc to try to avoid extra dependencies on 
> what might be considered internal Hive types, but it is true that we still 
> use the serde lib and TypeInfo is available from there. At this point, 
> though, the TypeDesc has already been created, and I'm not totally sure I 
> want to now go and remove it. If you feel strongly about this let me know.

Maybe this can be done in a follow-up if it involves a lot of surgery.
If removing the Hive dependency is desired or required, there are two 
possibilities: make TypeDescriptor in ORC a separate module, or make TypeInfo 
in Hive a separate module. But in the current state there is already a 
dependency on the Hive serde and TypeInfo, so it might be easier to reuse that 
code. This way there won't be a problem maintaining compatibility with Hive 
types when new ones get added.


- Prasanth_J


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58934/#review173697
---


On May 2, 2017, 9:52 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58934/
> ---
> 
> (Updated May 2, 2017, 9:52 p.m.)
> 
> 
> Review request for hive, Gunther Hagleitner, Prasanth_J, and Siddharth Seth.
> 
> 
> Bugs: HIVE-16568
> https://issues.apache.org/jira/browse/HIVE-16568
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - Support list/map/struct types in the LLAPRowInputFormat Schema/TypeDesc
> - Support list/map/struct types in the LLAPRowInputFormat Row. Changes in the 
> Row getters/setters needed (no longer using Writable).
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlap.java 
> de47412 
>   llap-client/src/java/org/apache/hadoop/hive/llap/LlapRowRecordReader.java 
> ee92f3e 
>   llap-common/src/java/org/apache/hadoop/hive/llap/Row.java a84fadc 
>   llap-common/src/java/org/apache/hadoop/hive/llap/TypeDesc.java dda5928 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
> 9ddbd7e 
> 
> 
> Diff: https://reviews.apache.org/r/58934/diff/1/
> 
> 
> Testing
> ---
> 
> Added test to TestJdbcWithMiniLlap
> 
> 
> Thanks,
> 
> Jason Dere
> 
>



Re: [VOTE] Apache Hive 2.3.0 Release Candidate 0

2017-05-03 Thread Vihang Karajgaonkar
I think we need to fix HIVE-16577 before releasing 2.3.0. The metastore
schema initialization script for mssql has been broken since 2.2.0.

Also, I noticed that JIRA shows 2.2.0 as unreleased. Does anyone know how
to fix it?

On Wed, May 3, 2017 at 8:48 AM, Sergio Pena 
wrote:

> Thanks Rui.
>
> Pengcheng, the patch is reverted, you may continue with the RC1.
>
> On Tue, May 2, 2017 at 11:02 PM, Rui Li  wrote:
>
> > The patch has been reverted in master and branch-2.3
> >
> > On Wed, May 3, 2017 at 3:01 AM, Sergio Pena 
> > wrote:
> >
> > > Hi Pengcheng,
> > >
> > > There is a request from the HDFS team to revert the patch committed on
> > > HIVE-16047 from
> > > our code because it might cause problems when future Hadoop versions
> are
> > > released due to being a
> > > private API on Hadoop. This API method signature has been changed
> between
> > > releases, and
> > > we don't want to have additional shims to support future Hadoop
> versions
> > > just for this method.
> > >
> > > I'd like to revert it from 2.3.0 release before doing the release. It
> is
> > > marked as being fixed on 2.2 but it is not cherry-picked on branch-2.2
> > but
> > > branch-2.3.
> > >
> > > Do you agree?
> > >
> > > - Sergio
> > >
> > > On Fri, Apr 28, 2017 at 1:40 PM, Pengcheng Xiong 
> > > wrote:
> > >
> > > > Withdraw the VOTE on candidate 0. Will propose candidate 1 soon.
> > Thanks.
> > > >
> > > > On Thu, Apr 27, 2017 at 8:10 PM, Owen O'Malley <
> owen.omal...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > -1 you need a release of storage-API first.
> > > > >
> > > > > .. Owen
> > > > >
> > > > > > On Apr 27, 2017, at 17:43, Pengcheng Xiong 
> > > wrote:
> > > > > >
> > > > > > Apache Hive 2.3.0 Release Candidate 0 is available here:
> > > > > > http://home.apache.org/~pxiong/apache-hive-2.3.0-rc0/
> > > > > >
> > > > > >
> > > > > > Maven artifacts are available here:
> > > > > > https://repository.apache.org/content/repositories/
> > > orgapachehive-1073/
> > > > > >
> > > > > >
> > > > > > Source tag for RC0 is at:
> > > > > >
> > > > > > https://github.com/apache/hive/releases/tag/release-2.3.0-rc0
> > > > > >
> > > > > > Voting will conclude in 72 hours.
> > > > > >
> > > > > > Hive PMC Members: Please test and vote.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Best regards!
> > Rui Li
> > Cell: (+86) 13564950210
> >
>



Re: Review Request 58973: HIVE-16578

2017-05-03 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173818
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 225 (patched)


Can't totally tell the nesting here, but it seems like if getColumnName() 
returns false this will not do semijoin reduction for this column, regardless 
of whether there is a hint or not... is that intended?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 235 (patched)


Remove your name from the log line.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 306 (patched)


Not sure if you can do this and truly say this is the column for this 
expression - the expression could be a function, such as replace(col1, col2).

How about ExprNodeDescUtils.getColumnExpr()?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 320 (patched)


same here, about ExprNodeDescUtils.getColumnExpr()


- Jason Dere


On May 3, 2017, 8:12 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> ---
> 
> (Updated May 3, 2017, 8:12 p.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  b8c01020b7 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 58973: HIVE-16578

2017-05-03 Thread Deepak Jaiswal


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 225 (patched)
> > 
> >
> > Can't totally tell the nesting here, but it seems like if 
> > getColumnName() returns false this will not do semijoin reduction for this 
> > column, regardless of if there is a hint or not .. is that intended?

Yes. From the old logic, in generateSemiJoinOperator(), we would return if we 
fail to obtain the appropriate ExprNodeColumnDesc. Instead of returning there, 
now it never calls the function.
Basically, the logic has been pushed down.


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 235 (patched)
> > 
> >
> > Remove your name from the log line.

Thanks! Forgot to remove it after debugging the code.


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 306 (patched)
> > 
> >
> > Not sure if you can do this and truly say this is the column for this 
> > expression - the expression could be a function, such as replace(col1, 
> > col2).
> > 
> > How about ExprNodeDescUtils.getColumnExpr()?

Thanks for telling me about the method, will see if it works in this case.
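For the record, the pitfall Jason describes — treating an arbitrary expression as if it were a bare column — can be illustrated with a toy model. The classes below are invented for illustration only; they are not Hive's actual ExprNodeDesc/ExprNodeDescUtils API.

```java
import java.util.Arrays;
import java.util.List;

// Toy expression tree: a column reference, or a function over child expressions.
abstract class Expr { }

class ColumnExpr extends Expr {
    final String name;
    ColumnExpr(String name) { this.name = name; }
}

class FuncExpr extends Expr {
    final List<Expr> children;
    FuncExpr(Expr... children) { this.children = Arrays.asList(children); }
}

public class ColumnUnwrap {
    // Return the underlying column name only when the expression is a column,
    // possibly wrapped in single-argument functions (e.g. a cast). A function
    // over several columns, like replace(col1, col2), has no single source
    // column, so we return null -- the idea behind getColumnExpr()-style helpers.
    static String getColumnName(Expr e) {
        if (e instanceof ColumnExpr) {
            return ((ColumnExpr) e).name;
        }
        if (e instanceof FuncExpr && ((FuncExpr) e).children.size() == 1) {
            return getColumnName(((FuncExpr) e).children.get(0));
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(getColumnName(new ColumnExpr("ds")));               // ds
        System.out.println(getColumnName(new FuncExpr(new ColumnExpr("ds")))); // ds
        System.out.println(getColumnName(
                new FuncExpr(new ColumnExpr("c1"), new ColumnExpr("c2"))));    // null
    }
}
```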


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 320 (patched)
> > 
> >
> > same here, about ExprNodeDescUtils.getColumnExpr()

ditto


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173818
---


On May 3, 2017, 8:12 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> ---
> 
> (Updated May 3, 2017, 8:12 p.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  b8c01020b7 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



[jira] [Created] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-16579:
---

 Summary: CachedStore: improvements to partition col stats caching
 Key: HIVE-16579
 URL: https://issues.apache.org/jira/browse/HIVE-16579
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 50787: Add a timezone-aware timestamp

2017-05-03 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50787/#review173833
---




ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
Lines 132 (patched)


I think Identifier["timestamptz"] and Identifier["zone"] may be sufficient. 
It is not necessary to make them keywords and then add them back as 
identifiers. You can give it a try and see if it works. Thanks.


- pengcheng xiong


On May 3, 2017, 6:34 a.m., Rui Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50787/
> ---
> 
> (Updated May 3, 2017, 6:34 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-14412
> https://issues.apache.org/jira/browse/HIVE-14412
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The 1st patch to add timezone-aware timestamp.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/type/TimestampTZ.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/type/TestTimestampTZ.java 
> PRE-CREATION 
>   contrib/src/test/queries/clientnegative/serde_regex.q a676338 
>   contrib/src/test/queries/clientpositive/serde_regex.q d75d607 
>   contrib/src/test/results/clientnegative/serde_regex.q.out 58b1c02 
>   contrib/src/test/results/clientpositive/serde_regex.q.out 2984293 
>   hbase-handler/src/test/queries/positive/hbase_timestamp.q 0350afe 
>   hbase-handler/src/test/results/positive/hbase_timestamp.q.out 3918121 
>   itests/hive-blobstore/src/test/queries/clientpositive/orc_format_part.q 
> 358eccd 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/orc_nonstd_partitions_loc.q
>  c462538 
>   itests/hive-blobstore/src/test/queries/clientpositive/rcfile_format_part.q 
> c563d3a 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/rcfile_nonstd_partitions_loc.q
>  d17c281 
>   itests/hive-blobstore/src/test/results/clientpositive/orc_format_part.q.out 
> 5d1319f 
>   
> itests/hive-blobstore/src/test/results/clientpositive/orc_nonstd_partitions_loc.q.out
>  70e72f7 
>   
> itests/hive-blobstore/src/test/results/clientpositive/rcfile_format_part.q.out
>  bed10ab 
>   
> itests/hive-blobstore/src/test/results/clientpositive/rcfile_nonstd_partitions_loc.q.out
>  c6442f9 
>   jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java ade1900 
>   jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java 38918f0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8dc5f2e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java f8b55da 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 
> 01a652d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/TypeConverter.java
>  38308c9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 0cf9205 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 0721b92 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g d98a663 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 8598fae 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 8f8eab0 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java bda2050 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 7cdf2c3 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 5cacd59 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java 68d98f5 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 
> 5a31e61 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToTimestampTZ.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/parse/TestSQL11ReservedKeyWordsNegative.java
>  0dc6b19 
>   ql/src/test/queries/clientnegative/serde_regex.q c9cfc7d 
>   ql/src/test/queries/clientnegative/serde_regex2.q a29bb9c 
>   ql/src/test/queries/clientnegative/serde_regex3.q 4e91f06 
>   ql/src/test/queries/clientpositive/create_like.q bd39731 
>   ql/src/test/queries/clientpositive/join43.q 12c45a6 
>   ql/src/test/queries/clientpositive/serde_regex.q e21c6e1 
>   ql/src/test/queries/clientpositive/timestamptz.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/timestamptz_1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/timestamptz_2.q PRE-CREATION 
>   ql/src/test/results/clientnegative/serde_regex.q.out a1ec5ca 
>   ql/src/test/results/clientnegative/serde_regex2.q.out 374675d 
>   ql/src/test/results/clientnegative/serde_regex3.q.out dc0a0e2 
>   ql/src/test/results/clientpositive/create_like.q.out ff2e752 
>   ql/src/test/results/clientpositive/join43.q.out e8c7278 
>   ql/src/test/results/clientpositive/serde_regex.q.out 7bebb0c 
>   

[GitHub] hive pull request #176: HIVE-16530: Improve execution logs for REPL commands

2017-05-03 Thread sankarh
GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/176

HIVE-16530: Improve execution logs for REPL commands



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-16530

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #176


commit d9a5e0aaf317cfdd60d1f2b62840e3f03d0fd1ac
Author: Sankar Hariappan 
Date:   2017-05-03T21:54:41Z

HIVE-16530: Improve execution logs for REPL commands




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (HIVE-16580) CachedStore: Cache column stats for unpartitioned tables

2017-05-03 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-16580:
---

 Summary: CachedStore: Cache column stats for unpartitioned tables
 Key: HIVE-16580
 URL: https://issues.apache.org/jira/browse/HIVE-16580
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58973: HIVE-16578

2017-05-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
---

(Updated May 4, 2017, 1:19 a.m.)


Review request for hive and Jason Dere.


Changes
---

Worked on the review comments. Added some more tests.


Repository: hive-git


Description
---

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 b8c01020b7 
  ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
1d1f86bfaa 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 38ef1c 


Diff: https://reviews.apache.org/r/58973/diff/2/

Changes: https://reviews.apache.org/r/58973/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-16581) improve upon HIVE-16523 II

2017-05-03 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-16581:
---

 Summary: improve upon HIVE-16523 II
 Key: HIVE-16581
 URL: https://issues.apache.org/jira/browse/HIVE-16581
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Sergey Shelukhin


Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16582) HashTableLoader should log info about the input, rows, size etc.

2017-05-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-16582:


 Summary: HashTableLoader should log info about the input, rows, 
size etc.
 Key: HIVE-16582
 URL: https://issues.apache.org/jira/browse/HIVE-16582
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Minor


It will be useful to log the following info during hash table loading:
- input name
- number of rows 
- estimated data size (LLAP tracks this)
- object cache key



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 50787: Add a timezone-aware timestamp

2017-05-03 Thread Rui Li


> On May 3, 2017, 9:57 p.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
> > Lines 132 (patched)
> > 
> >
> > I think Identifier["timestamptz"] and Identifier["zone"] may be 
> > sufficient. It is not necessary to make them as key words and then add them 
> > back as identifiers. You can have a try and see if it works. Thanks..

Hi Pengcheng, sorry I'm quite ignorant about antlr. Could you please be more 
specific about how to add the Identifiers? Let me explain what I intend to do. 
The new data type is named "timestamp with time zone", and "timestamptz" is 
added as a type alias. I thought it was required to add keywords for type 
names. And according to the PostgreSQL doc we referenced 
(https://www.postgresql.org/docs/9.5/static/sql-keywords-appendix.html), "zone" 
is a non-reserved SQL keyword and "timestamptz" is not a keyword. So I added 
them in IdentifiersParser.g as nonReserved.
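
Concretely, the grammar changes look roughly like this (a sketch following 
Hive's KW_ token convention; the exact token names and rule text in the patch 
may differ):

```
// HiveLexer.g -- register the new keywords
KW_TIMESTAMPTZ: 'TIMESTAMPTZ';
KW_ZONE: 'ZONE';

// IdentifiersParser.g -- keep them non-reserved so existing queries that
// use "zone" or "timestamptz" as identifiers still parse
nonReserved
    :
    ... | KW_TIMESTAMPTZ | KW_ZONE | ...
    ;
```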


- Rui


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50787/#review173833
---


On May 3, 2017, 6:34 a.m., Rui Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50787/
> ---
> 
> (Updated May 3, 2017, 6:34 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-14412
> https://issues.apache.org/jira/browse/HIVE-14412
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The 1st patch to add timezone-aware timestamp.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/type/TimestampTZ.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/type/TestTimestampTZ.java 
> PRE-CREATION 
>   contrib/src/test/queries/clientnegative/serde_regex.q a676338 
>   contrib/src/test/queries/clientpositive/serde_regex.q d75d607 
>   contrib/src/test/results/clientnegative/serde_regex.q.out 58b1c02 
>   contrib/src/test/results/clientpositive/serde_regex.q.out 2984293 
>   hbase-handler/src/test/queries/positive/hbase_timestamp.q 0350afe 
>   hbase-handler/src/test/results/positive/hbase_timestamp.q.out 3918121 
>   itests/hive-blobstore/src/test/queries/clientpositive/orc_format_part.q 
> 358eccd 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/orc_nonstd_partitions_loc.q
>  c462538 
>   itests/hive-blobstore/src/test/queries/clientpositive/rcfile_format_part.q 
> c563d3a 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/rcfile_nonstd_partitions_loc.q
>  d17c281 
>   itests/hive-blobstore/src/test/results/clientpositive/orc_format_part.q.out 
> 5d1319f 
>   
> itests/hive-blobstore/src/test/results/clientpositive/orc_nonstd_partitions_loc.q.out
>  70e72f7 
>   
> itests/hive-blobstore/src/test/results/clientpositive/rcfile_format_part.q.out
>  bed10ab 
>   
> itests/hive-blobstore/src/test/results/clientpositive/rcfile_nonstd_partitions_loc.q.out
>  c6442f9 
>   jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java ade1900 
>   jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java 38918f0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 8dc5f2e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java f8b55da 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 
> 01a652d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/TypeConverter.java
>  38308c9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 0cf9205 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 0721b92 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g d98a663 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 8598fae 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 8f8eab0 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java bda2050 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 7cdf2c3 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 5cacd59 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java 68d98f5 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 
> 5a31e61 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToTimestampTZ.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/parse/TestSQL11ReservedKeyWordsNegative.java
>  0dc6b19 
>   ql/src/test/queries/clientnegative/serde_regex.q c9cfc7d 
>   ql/src/test/queries/clientnegative/serde_regex2.q a29bb9c 
>   ql/src/test/queries/clientnegative/serde_regex3.q 4e91f06 
>   ql/src/test/queries/clientpositive/create_like.q bd39731 
>   ql/src/test/queries/clientpositive/join43.q 12c45a6 
>   

Re: Review Request 58501: HIVE-16469: Parquet timestamp table property is not always taken into account

2017-05-03 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58501/#review173857
---




ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
Lines 372 (patched)


Can we check the format type to see whether it's Parquet format?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java
Lines 101 (patched)


Can we rename this method to something like 
propagateParquetTimeZoneTableProperty? Usually a method with the prefix 
"check" should not have side effects.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
Line 181 (original), 181 (patched)


Why not pass in the default value here when 
PARQUET_INT96_WRITE_ZONE_PROPERTY is not set?


- cheng xu


On May 3, 2017, 8:59 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58501/
> ---
> 
> (Updated May 3, 2017, 8:59 p.m.)
> 
> 
> Review request for hive, Sergio Pena and Zoltan Ivanfi.
> 
> 
> Bugs: HIVE-16469
> https://issues.apache.org/jira/browse/HIVE-16469
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16469: Parquet timestamp table property is not always taken into account
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> 757b7fc0eaa39c956014aa446ab1b07fc4abf8d3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 
> 13750cdc34711d22f2adf2f483a6773ad05fb8d2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
> 9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 
> 1bd4db7805689ae1f91921ffbb5ff7da59f4bf60 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java
>  f4fadbb61bf45f62945700284c0b050f0984b696 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 2954601ce5bb25905cdb29ca0ca4551c2ca12b95 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
> 6413c5add6db2e8c9298285b15dba33ee74379a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java 
> b339cc4347eea143dca2f6d98f9aaafdc427 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 
> dbd6fb3d0bc8c753abf86e99b52377617f248b5a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java
>  c81499a91c84af3ba33f335506c1c44e7085f13d 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRowGroupFilter.java
>  bf363f32a3ac0a4d790e2925d802c6e210adfb4b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java
>  f2d79cf9d215e9a6e2a5e88cfc78378be860fd1f 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java
>  1e10dbf18742524982606f1e6c6d447d683b2dc3 
>   ql/src/test/queries/clientnegative/parquet_int96_alter_invalid_timezone.q 
> PRE-CREATION 
>   ql/src/test/queries/clientnegative/parquet_int96_create_invalid_timezone.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 
> 6eadd1b0a3313cbba7a798890b802baae302749e 
>   
> ql/src/test/results/clientnegative/parquet_int96_alter_invalid_timezone.q.out 
> PRE-CREATION 
>   
> ql/src/test/results/clientnegative/parquet_int96_create_invalid_timezone.q.out
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_int96_timestamp.q.out 
> b9a3664458a83f1856e4bc59eba5d56665df61cc 
>   ql/src/test/results/clientpositive/spark/parquet_int96_timestamp.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/58501/diff/4/
> 
> 
> Testing
> ---
> 
> Added qtests for the following cases:
> - order by clause
> - selfjoin
> - calling UDFs with the timestamp values
> - where clause with a constant cast as timestamp
> - test for HoS
> - implicit and explicit timestamp conversions in insert clause
> 
> Tested manually but no qtests:
> - join between 3 tables all parquet but with different/no timezone property
> - subselect in from/where clauses
> - exists / union / no exists
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>



[jira] [Created] (HIVE-16573) In-place update for HoS can't be disabled

2017-05-03 Thread Rui Li (JIRA)
Rui Li created HIVE-16573:
-

 Summary: In-place update for HoS can't be disabled
 Key: HIVE-16573
 URL: https://issues.apache.org/jira/browse/HIVE-16573
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor


{{hive.spark.exec.inplace.progress}} has no effect



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58865: HIVE-16552: Limit the number of tasks a Spark job may contain

2017-05-03 Thread Rui Li


> On May 3, 2017, 3:35 a.m., Rui Li wrote:
> >

Xuefu, the patch looks good to me overall. Thanks for the work. Do you think we 
should add some negative test cases for it?


> On May 3, 2017, 3:35 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java
> > Lines 132 (patched)
> > 
> >
> > I think the log is unnecessary because the failure should already be 
> > logged in the monitor
> 
> Xuefu Zhang wrote:
> This is not new code.

Do you mean "LOG.info("Failed to submit Spark job " + sparkJobID);" is not new 
code? I don't find it in the current SparkTask.java.


> On May 3, 2017, 3:35 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java
> > Lines 135 (patched)
> > 
> >
> > Same as above. Can we consolidate the logs a bit?
> 
> Xuefu Zhang wrote:
> Jobmonitor prints it on console, while the log here is written to 
> hive.log.

The console.printInfo method does both printing and logging:

public void printInfo(String info, String detail, boolean isSilent) {
  if (!isSilent) {
    getInfoStream().println(info);
  }
  LOG.info(info + StringUtils.defaultString(detail));
}


> On May 3, 2017, 3:35 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/RemoteSparkJobMonitor.java
> > Lines 104 (patched)
> > 
> >
> > Maybe I was being misleading. I mean we can compute the total task only 
> > once when the job first reaches RUNNING state, i.e. in the "if (!running)". 
> > At this point, the total count is determined and won't change.
> 
> Xuefu Zhang wrote:
> Yeah. However, I'd like to keep the state transition to running first 
> before breaking out and returning rc=4. In fact, if we lose the transition, 
> Hive actually goes into an unstable state. What you said was what I tried in 
> the first place.

I see. Thanks for the explanation.
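
The scheme Xuefu describes — record the transition to RUNNING first, then 
check the (now fixed) total task count against the limit and return rc=4 if 
it is exceeded — can be sketched as follows. All names here are hypothetical; 
the actual RemoteSparkJobMonitor code differs.

```java
// Toy sketch of a task-count limit check in a Spark job monitor.
// Class and method names are invented for illustration.
public class TaskLimitSketch {
    private boolean running = false;

    // One polling step; returns 0 to continue monitoring, 4 to abort the job.
    // A negative maxTasks means the limit is disabled.
    int poll(String state, int totalTasks, int maxTasks) {
        if (!running && "RUNNING".equals(state)) {
            running = true;  // record the state transition before any early return
            if (maxTasks >= 0 && totalTasks > maxTasks) {
                return 4;    // job contains more tasks than the configured limit
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(new TaskLimitSketch().poll("RUNNING", 100, 50)); // 4
        System.out.println(new TaskLimitSketch().poll("RUNNING", 100, -1)); // 0
    }
}
```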


- Rui


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58865/#review173689
---


On May 2, 2017, 6:49 p.m., Xuefu Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58865/
> ---
> 
> (Updated May 2, 2017, 6:49 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-16552
> https://issues.apache.org/jira/browse/HIVE-16552
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA description
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84398c6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 32a7730 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/RemoteSparkJobMonitor.java
>  dd73f3e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java 
> 0b224f2 
> 
> 
> Diff: https://reviews.apache.org/r/58865/diff/3/
> 
> 
> Testing
> ---
> 
> Test locally
> 
> 
> Thanks,
> 
> Xuefu Zhang
> 
>



[jira] [Created] (HIVE-16574) Select count(*) throws ClassCastException in the presence of struct data type object

2017-05-03 Thread Mahesh Nayak (JIRA)
Mahesh Nayak created HIVE-16574:
---

 Summary: Select count(*) throws ClassCastException in the presence 
of struct data type object
 Key: HIVE-16574
 URL: https://issues.apache.org/jira/browse/HIVE-16574
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Mahesh Nayak


On executing select count(*) on a table containing a struct data type, the 
exception below is thrown:

{code:None}
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1487048408006_0538_2_00, 
diagnostics=[Task failed, taskId=task_1487048408006_0538_2_00_01, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.io.IOException: error iterating
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: error iterating
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: java.io.IOException: java.io.IOException: error iterating
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:141)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
... 16 more
Caused by: java.io.IOException: error iterating
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:92)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:42)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
... 22 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcListObjectInspector cannot be 
cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:297)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.acidAddRowToBatch(VectorizedBatchUtil.java:277)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:82)
... 24 more
], TaskAttempt 1 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.io.IOException: error iterating
at 

[jira] [Created] (HIVE-16583) Select query fails with some exception error

2017-05-03 Thread Atul (JIRA)
Atul created HIVE-16583:
---

 Summary: Select query fails with some exception error
 Key: HIVE-16583
 URL: https://issues.apache.org/jira/browse/HIVE-16583
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Atul
Priority: Critical




After loading the ORC file into a Hive table, my select * query is aborted 
with the error below:

Fetching results ran into the following error(s):

Bad status for request TFetchResultsReq(fetchType=0, 
operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, 
operationType=0, 
operationId=THandleIdentifier(secret='f\xfa\xf32\xd1KB\xbb\xb9\t\xe8\x1c\xd1\x01\xa5\xf2',
 guid=')\x17\r4\x7f D\xcf\xa0Z\xff\x8a70\xaa\x93')), orientation=4, 
maxRows=100): TFetchResultsResp(status=TStatus(errorCode=0, 
errorMessage='java.io.IOException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException', sqlState=None, 
infoMessages=['*org.apache.hive.service.cli.HiveSQLException:java.io.IOException:
 org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException:25:24', 
'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:352',
 
'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:221',
 
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:707',
 'sun.reflect.GeneratedMethodAccessor45:invoke::-1', 
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
 'java.lang.reflect.Method:invoke:Method.java:606', 
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
 
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
 
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
 'java.security.AccessController:doPrivileged:AccessController.java:-2', 
'javax.security.auth.Subject:doAs:Subject.java:415', 
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1692',
 
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
 'com.sun.proxy.$Proxy45:fetchResults::-1', 
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454', 
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
 
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
 
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 
'org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor:process:HadoopThriftAuthBridge.java:692',
 
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
 
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
 
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
 'java.lang.Thread:run:Thread.java:745', 
'*java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException:27:2', 
'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:154', 
'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1720', 
'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:347',
 
'*org.apache.hadoop.hive.ql.metadata.HiveException:java.lang.ClassCastException:34:7',
 
'org.apache.hadoop.hive.ql.exec.ListSinkOperator:process:ListSinkOperator.java:93',
 'org.apache.hadoop.hive.ql.exec.Operator:forward:Operator.java:838', 
'org.apache.hadoop.hive.ql.exec.SelectOperator:process:SelectOperator.java:88', 
'org.apache.hadoop.hive.ql.exec.Operator:forward:Operator.java:838', 
'org.apache.hadoop.hive.ql.exec.TableScanOperator:process:TableScanOperator.java:97',
 'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:425', 
'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:417', 
'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:140', 
'*java.lang.ClassCastException:null:0:-1'], statusCode=3), results=None, 
hasMoreRows=None)
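
Both this report and HIVE-16574 above bottom out in an unchecked cast on an object-inspector-style type. As a generic illustration (not Hive code; the interface and class names below are hypothetical stand-ins for the ObjectInspector hierarchy), the defensive pattern is to dispatch on the runtime type before casting:

```java
// Generic sketch of checking the runtime type before casting, which avoids
// the ClassCastException seen in the reports above. Hypothetical types; the
// real Hive ObjectInspector hierarchy is richer than this.
interface Inspector {}
class PrimitiveInspector implements Inspector {}
class ListInspector implements Inspector {}

public class SafeCastSketch {
    // Instead of blindly casting to PrimitiveInspector (which throws
    // ClassCastException when handed a ListInspector), branch on the type.
    static String describe(Inspector oi) {
        if (oi instanceof PrimitiveInspector) {
            return "primitive";
        } else if (oi instanceof ListInspector) {
            return "list";
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        System.out.println(describe(new PrimitiveInspector())); // prints primitive
        System.out.println(describe(new ListInspector()));      // prints list
    }
}
```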




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)