Re: Review Request 65342: HIVE-18546

2018-02-01 Thread Gopal V

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196692
---




ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
Line 5127 (original), 5129 (patched)


This is a list of txns which is longer than what we care about, because it 
contains a list of all aborted txns which haven't been cleaned up yet.

The relevant snapshot for the use-case is just the txns which are currently 
writing and may be committed in the future.

If there are 100k aborted txns, it is relevant for the reader query, but 
not for the mview txn state.
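
To illustrate the distinction, here is a minimal sketch that keeps only the txns that are still open. The TxnState enum, the TxnInfo class and openTxnIds() are hypothetical names made up for this example; they are not the metastore API and not what the patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class OpenTxnFilter {
    // Hypothetical transaction states, for illustration only.
    public enum TxnState { OPEN, ABORTED, COMMITTED }

    // Hypothetical (id, state) pair; not a Hive class.
    public static final class TxnInfo {
        final long id;
        final TxnState state;

        public TxnInfo(long id, TxnState state) {
            this.id = id;
            this.state = state;
        }
    }

    /** Returns the ids of txns that are still writing and may commit later. */
    public static List<Long> openTxnIds(List<TxnInfo> snapshot) {
        List<Long> open = new ArrayList<>();
        for (TxnInfo t : snapshot) {
            // Aborted txns matter to a reader query, but not to this state.
            if (t.state == TxnState.OPEN) {
                open.add(t.id);
            }
        }
        return open;
    }

    public static void main(String[] args) {
        List<TxnInfo> snapshot = new ArrayList<>();
        snapshot.add(new TxnInfo(7, TxnState.ABORTED));
        snapshot.add(new TxnInfo(8, TxnState.OPEN));
        snapshot.add(new TxnInfo(9, TxnState.ABORTED));
        System.out.println(openTxnIds(snapshot)); // prints [8]
    }
}
```

With 100k aborted txns and a handful of open ones, the stored list would shrink to just the open ids under this assumption.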


- Gopal V


On Jan. 31, 2018, 12:07 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65342/
> ---
> 
> (Updated Jan. 31, 2018, 12:07 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18546
> https://issues.apache.org/jira/browse/HIVE-18546
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18546
> 
> 
> Diffs
> -
> 
>   metastore/scripts/upgrade/derby/048-HIVE-14498.derby.sql 
> 4ffd054530503681de1c9f6d65f8187fc1b7520d 
>   metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql 
> 6a59b0df712c8a9f9be880cec5fd8c8eddda4a7d 
>   metastore/scripts/upgrade/derby/hive-txn-schema-3.0.0.derby.sql 
> d72b06cb5866edf93dbcbb20268fc899439e5c43 
>   metastore/scripts/upgrade/hive/hive-schema-3.0.0.hive.sql 
> eb4f0124b5a7829e58d5e9a6a604c201ccea998a 
>   metastore/scripts/upgrade/mssql/033-HIVE-14498.mssql.sql 
> 3a47600bb09e2c20cc12f8759e1287001367604e 
>   metastore/scripts/upgrade/mssql/hive-schema-3.0.0.mssql.sql 
> c45bb3e323c640223b19831abbf4e806c3019f0b 
>   metastore/scripts/upgrade/mysql/048-HIVE-14498.mysql.sql 
> 986eaf5272eab560fa2f862910aaf74c5332c716 
>   metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 
> 01c995d632d94a8f9cc3f46f94c54290abb3da13 
>   metastore/scripts/upgrade/mysql/hive-txn-schema-3.0.0.mysql.sql 
> 497846f994d431d8717aea36d4ad569892e3c8c3 
>   metastore/scripts/upgrade/oracle/048-HIVE-14498.oracle.sql 
> 0b01e89d92f7f48439024aeb326d675d123f0f8c 
>   metastore/scripts/upgrade/oracle/hive-schema-3.0.0.oracle.sql 
> e1aee6fb6c84999b17f87f80750582fafeae063f 
>   metastore/scripts/upgrade/oracle/hive-txn-schema-3.0.0.oracle.sql 
> 5411bc47103f901623244bc26c0ace87e10ad2e1 
>   metastore/scripts/upgrade/postgres/047-HIVE-14498.postgres.sql 
> 8d4de8870d93bab49c873cab44e6714b93491744 
>   metastore/scripts/upgrade/postgres/hive-schema-3.0.0.postgres.sql 
> 28cb01684a46aaeea40d7cbe1973d7bc20810988 
>   metastore/scripts/upgrade/postgres/hive-txn-schema-3.0.0.postgres.sql 
> a81d6eec6d6235706f1225d541f8290971cc6215 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> 51ef39057434c41fbe760c547e3bf231e65e4cc0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
> 9b0ffe0e91db05ae623531248f12745266789a11 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> d159e4bed1cd4ff04bed1c397318bc2951c02a51 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> dbf9363d11deea5377808094d7cb3331ee0f999f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java 
> 97baf25ea8bbe6f55e46e2ea5bdf33a5a71eecdf 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ImportTableDesc.java 
> 3535fa4d02106f4e96af6c33ffa291c8db21e3bc 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 
> aa95d2fcdcd0b3cede35537a1d7d041ee738e4a8 
>   ql/src/test/results/clientpositive/llap/sysdb.q.out 
> 5ed427fd2aa6fbb83877031e6692bd8f1994730d 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> 42bc9297e72ac8fd77352cb786cfed3abf5af59b 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> 8b78230a32d4d4339189c1db4b533ed04ec080af 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  6a2ff6c4c681b2dbaf339b214663212a2e6dab22 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> df646a7d1771892e4404be5c4fba183c0f914510 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 27f8c0f2fcb24a90be8a44d68947589004286c28 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AbortTxnsRequest.java
>  398f8d4e93c6077c110e6469bcd3715fdad5a634 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddDynamicPartitions.java
>  2102aa5215598edfe5e5c53d541c4fe02ebc7f09 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddForeignKeyRequest.java
>  a2225298e72f708e97324048592c37a308e43514 
>   
> 

Re: Review Request 65342: HIVE-18546

2018-02-01 Thread Ashutosh Chauhan


> On Feb. 2, 2018, 1:31 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 1238 (patched)
> > 
> >
> > If it's an MV, all metadata related to it is erased. That's good. But if 
> > it's a table participating in an MV, then should dropping such a table also 
> > result in an automatic drop of all MVs it's part of? Not sure how other DBs 
> > handle it. But we should note this in comments here.
> 
> Jesús Camacho Rodríguez wrote:
> I planned to create a follow-up for this. Maybe failing unless using a 
> cascade option? I will check how it is done in other RDBMSs.

Follow-up is ok.


> On Feb. 2, 2018, 1:31 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 1914 (original), 1945 (patched)
> > 
> >
> > Shall we validate the length of the txnList and throw if it's too long?
> 
> Jesús Camacho Rodríguez wrote:
> If we try to write a txnList longer than the CLOB size supported by the 
> backing RDBMS, the write will fail. I will specify the length of the CLOB at 
> the RDBMS DDL level so the limit is exactly the same for all metastore RDBMSs. 
> Introducing the limit check here seems redundant, since the write will fail. 
> What do you think?

I am not sure text, clob, etc. have the same length across different DBs. 
However, the write will fail if the value is longer than whatever that length 
is. So, agreed, the check is redundant.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196673
---



[jira] [Created] (HIVE-18611) Avoid memory allocation of aggregation buffer during stats computation

2018-02-01 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-18611:
---

 Summary: Avoid memory allocation of aggregation buffer during 
stats computation 
 Key: HIVE-18611
 URL: https://issues.apache.org/jira/browse/HIVE-18611
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer, Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


The Bloom filter aggregation buffer may result in the allocation of an array 
of up to ~594MB, which is unnecessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #304: HIVE-18581: Replication events should use lower case...

2018-02-01 Thread anishek
Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/304


---


[jira] [Created] (HIVE-18610) Performance: ListKeyWrapper does not check for hashcode equals, before comparing members

2018-02-01 Thread Gopal V (JIRA)
Gopal V created HIVE-18610:
--

 Summary: Performance: ListKeyWrapper does not check for hashcode 
equals, before comparing members
 Key: HIVE-18610
 URL: https://issues.apache.org/jira/browse/HIVE-18610
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


ListKeyWrapper::equals() 

{code}
@Override
public boolean equals(Object obj) {
  if (!(obj instanceof ListKeyWrapper)) {
    return false;
  }
  Object[] copied_in_hashmap = ((ListKeyWrapper) obj).keys;
  return equalComparer.areEqual(copied_in_hashmap, keys);
}
{code}
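
A minimal sketch of the kind of short-circuit the summary asks for, written as a standalone class so it can be run on its own. HashCheckedKey is a hypothetical stand-in for ListKeyWrapper; caching the hash in a field and using Arrays.equals in place of equalComparer.areEqual are assumptions made for this example, not the actual fix.

```java
import java.util.Arrays;

public class HashCheckedKey {
    private final Object[] keys;
    // Computed once, so equals() can compare two ints before touching members.
    private final int hash;

    public HashCheckedKey(Object... keys) {
        this.keys = keys.clone();
        this.hash = Arrays.hashCode(this.keys);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof HashCheckedKey)) {
            return false;
        }
        HashCheckedKey other = (HashCheckedKey) obj;
        // Unequal hashes imply unequal keys, so skip the member-wise compare.
        if (hash != other.hash) {
            return false;
        }
        // Equal hashes can still collide, so the full comparison remains.
        return Arrays.equals(keys, other.keys);
    }
}
```

The check only helps when the hash is already available (for example cached at construction, as assumed here); recomputing it inside equals() would cost as much as the comparison it tries to avoid.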







Re: Yetus JDK version

2018-02-01 Thread Thejas Nair
+ Peter, Adam (The Yetus experts)


On Thu, Feb 1, 2018 at 9:49 AM, Alan Gates  wrote:

> Ok, looking briefly at it, it looks like if we changed
> testutils/…/TestScripts.java line 76 to set javaHome to 1.8 instead of 1.7
> that we’ll be running ptest with 1.8.  I’m not familiar with ptest, but I’m
> guessing that someone would need to make this change and then re-deploy
> ptest in our test infrastructure.  Is there anything else we need to do?
> We clearly already have 1.8 installed on the test machines because the code
> compiles with 1.8, but I don’t know what the path is, etc.
>
> Alan.
>
> On Tue, Jan 30, 2018 at 4:58 PM, Alan Gates  wrote:
>
> > I put code in the latest patch for HIVE-17983 that executes one of the
> > compiled classes as part of the maven build.  (It does this to
> > automatically generate the config template.)  This works locally and in the
> > ptest build.  But in the Yetus tests it fails with:
> >
> > Exception in thread "main" java.lang.UnsupportedClassVersionError:
> > org/apache/hadoop/hive/metastore/conf/ConfTemplatePrinter : Unsupported
> > major.minor version 52.0
> >
> > This means that it is compiling with JDK 1.8 but running it with 1.7.  How
> > do we switch the Yetus build so it runs maven with the correct JDK version?
> >
> > Alan.
> >
>
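
For reference, class-file major version 52 corresponds to Java 8 and 51 to Java 7, which is why a 1.7 JVM rejects the class. A small sketch of how the version can be read from the class-file header; ClassVersion and its methods are hypothetical helper names for this example and are not part of Yetus or ptest.

```java
import java.nio.ByteBuffer;

public class ClassVersion {
    /** Reads the major version from the first 8 bytes of a class file. */
    public static int majorVersion(byte[] classBytes) {
        // ByteBuffer is big-endian by default, matching the class-file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                  // minor version, ignored here
        return buf.getShort() & 0xFFFF;  // major version as unsigned 16-bit
    }

    /** Maps a major version to the Java release number (52 -> 8, 51 -> 7). */
    public static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Header of a class compiled for Java 8: magic, minor 0, major 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(javaRelease(majorVersion(header))); // prints 8
    }
}
```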


Re: Intellij + Checkstyle setup

2018-02-01 Thread Vineet Garg
Hi Vihang,

I am unable to import eclipse-styles.xml. I am using IntelliJ 2017.3 with 
Checkstyle 8.7. Maybe you are using a different version?

Vineet

> On Feb 1, 2018, at 5:58 PM, Vihang Karajgaonkar  wrote:
> 
> Thanks for sharing Vineet. How is this different than
> https://github.com/apache/hive/blob/master/dev-support/eclipse-styles.xml
> 
> I was able to import this in IntelliJ just fine.
> 
> 
> On Thu, Feb 1, 2018 at 4:42 PM, Vineet Garg  wrote:
> 
>> Hi,
>> 
>> If you would like to use IntelliJ’s Checkstyle plugin and are unable to
>> import the current checkstyle.xml under HIVE_REPO/checkstyle.xml, use a
>> modified version located at:
>> https://github.com/vineetgarg02/misc/blob/master/hive_tools/checkstyle.xml
>> 
>> The current format seems to be unsupported by the Checkstyle plugin.
>> 
>> Thanks,
>> Vineet
>> 



Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/#review196680
---


Ship it!





- Jesús Camacho Rodríguez


On Feb. 2, 2018, 2:18 a.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65304/
> ---
> 
> (Updated Feb. 2, 2018, 2:18 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, and Jesús Camacho 
> Rodríguez.
> 
> 
> Bugs: HIVE-18513
> https://issues.apache.org/jira/browse/HIVE-18513
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - For queries that result in MR/Tez/Spark jobs on the cluster, save the 
> temporary query results to a cache directory where they can be re-used.
> - Add QueryResultsCache to manage cached results. Currently cache 
> invalidation is time-based; update-based cache invalidation needs to be added 
> later.
> - Driver/SemanticAnalyzer/Calcite planner changes to look up queries in the 
> cache and use them in place of the query plan.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99 
>   common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java 2767bca 
>   data/conf/hive-site.xml 01f83d1 
>   data/conf/llap/hive-site.xml cdda875 
>   data/conf/perf-reg/spark/hive-site.xml 497a61f 
>   data/conf/perf-reg/tez/hive-site.xml 012369f 
>   data/conf/rlist/hive-site.xml 9de00e5 
>   data/conf/spark/local/hive-site.xml fd0e6a0 
>   data/conf/spark/standalone/hive-site.xml 1e5bd65 
>   data/conf/spark/yarn-client/hive-site.xml a9a788b 
>   data/conf/tez/hive-site.xml 4519678 
>   itests/hive-blobstore/src/test/resources/hive-site.xml 038db0d 
>   itests/src/test/resources/testconfiguration.properties d86ff58 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 4432aca 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b0 
>   ql/src/java/org/apache/hadoop/hive/ql/cache/results/CacheUsage.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7348faa 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
> f0dd167 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
> 372cfad 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 85a1f34 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java ae2ec3d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java dbf9363 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 7243dc7 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3946d4a 
>   ql/src/test/queries/clientpositive/results_cache_1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/results_cache_2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/results_cache_capacity.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/results_cache_lifetime.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/results_cache_temptable.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/results_cache_with_masking.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/results_cache_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_capacity.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_lifetime.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_temptable.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/results_cache_with_masking.q.out 
> PRE-CREATION 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 2a528cd 
> 
> 
> Diff: https://reviews.apache.org/r/65304/diff/6/
> 
> 
> Testing
> ---
> 
> qfile tests added.
> 
> 
> Thanks,
> 
> Jason Dere
> 
>



Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jesús Camacho Rodríguez


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
> > Lines 174 (patched)
> > 
> >
> > What happens if execution fails? Will the results still be cleaned 
> > properly?
> 
> Jason Dere wrote:
> This is being called in Driver.closeInProcess()/Driver.close(), so my 
> impression was this should be the case.
> 
> Jesús Camacho Rodríguez wrote:
> Makes sense. What if HS2 fails or closes? Should we add it to the 
> shutdown hook so it cleans it even if readers number is greater than zero? 
> (Not sure about the order in which things happen in this case, just want to 
> make sure we do not leave any residual data/dirs).
> 
> Jason Dere wrote:
> The QueryResultsCache initialization calls deleteOnExit() for the 
> results cache directory, so all subdirectories of this directory 
> (representing cached results) should be deleted on shutdown regardless of the 
> number of readers.
> 
> One thing to improve here might be a way to clean up after any previous Hive 
> instances where the process exited abnormally and may not have been able to 
> run the shutdown hook. I'll open another item for that.

Sounds good, please do that.
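
For illustration only, a minimal sketch of the cleanup-on-shutdown idea using plain java.nio on a local directory. The patch itself relies on deleteOnExit() for the results cache directory; CacheDirCleanup, deleteTree() and registerCleanup() are hypothetical names for this example. A hook like this does not run when the process is killed abnormally, which is the gap the follow-up item is meant to cover.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CacheDirCleanup {
    /** Recursively deletes dir and everything under it; no-op if it is missing. */
    public static void deleteTree(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers a JVM shutdown hook that removes dir regardless of readers. */
    public static void registerCleanup(Path dir) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                deleteTree(dir);
            } catch (UncheckedIOException e) {
                // Best effort: nothing more can be done during shutdown.
            }
        }));
    }
}
```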


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/#review196440
---



Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/
---

(Updated Feb. 2, 2018, 2:18 a.m.)


Review request for hive, Ashutosh Chauhan, Gopal V, and Jesús Camacho Rodríguez.


Changes
---

previous patch did not have the fixes, re-attaching patch


Bugs: HIVE-18513
https://issues.apache.org/jira/browse/HIVE-18513


Repository: hive-git


Description
---

- For queries that result in MR/Tez/Spark jobs on the cluster, save the 
temporary query results to a cache directory where they can be re-used.
- Add QueryResultsCache to manage cached results. Currently cache invalidation 
is time-based; update-based cache invalidation needs to be added later.
- Driver/SemanticAnalyzer/Calcite planner changes to look up queries in the 
cache and use them in place of the query plan.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99 
  common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java 2767bca 
  data/conf/hive-site.xml 01f83d1 
  data/conf/llap/hive-site.xml cdda875 
  data/conf/perf-reg/spark/hive-site.xml 497a61f 
  data/conf/perf-reg/tez/hive-site.xml 012369f 
  data/conf/rlist/hive-site.xml 9de00e5 
  data/conf/spark/local/hive-site.xml fd0e6a0 
  data/conf/spark/standalone/hive-site.xml 1e5bd65 
  data/conf/spark/yarn-client/hive-site.xml a9a788b 
  data/conf/tez/hive-site.xml 4519678 
  itests/hive-blobstore/src/test/resources/hive-site.xml 038db0d 
  itests/src/test/resources/testconfiguration.properties d86ff58 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 4432aca 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b0 
  ql/src/java/org/apache/hadoop/hive/ql/cache/results/CacheUsage.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7348faa 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
f0dd167 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 372cfad 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 85a1f34 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java ae2ec3d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java dbf9363 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 7243dc7 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3946d4a 
  ql/src/test/queries/clientpositive/results_cache_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_capacity.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_lifetime.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_temptable.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_with_masking.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/results_cache_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_capacity.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_lifetime.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_temptable.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_with_masking.q.out 
PRE-CREATION 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 2a528cd 


Diff: https://reviews.apache.org/r/65304/diff/6/

Changes: https://reviews.apache.org/r/65304/diff/5-6/


Testing
---

qfile tests added.


Thanks,

Jason Dere



[jira] [Created] (HIVE-18609) Results cache invalidation based on table updates

2018-02-01 Thread Jason Dere (JIRA)
Jason Dere created HIVE-18609:
-

 Summary: Results cache invalidation based on table updates
 Key: HIVE-18609
 URL: https://issues.apache.org/jira/browse/HIVE-18609
 Project: Hive
  Issue Type: Sub-task
Reporter: Jason Dere


Look into using the materialized view invalidation mechanisms to automatically 
invalidate queries in the results cache if the underlying tables used in the 
cached queries have been modified.





Re: Intellij + Checkstyle setup

2018-02-01 Thread Vihang Karajgaonkar
Thanks for sharing Vineet. How is this different than
https://github.com/apache/hive/blob/master/dev-support/eclipse-styles.xml

I was able to import this in IntelliJ just fine.


On Thu, Feb 1, 2018 at 4:42 PM, Vineet Garg  wrote:

> Hi,
>
> If you would like to use IntelliJ’s Checkstyle plugin and are unable to
> import the current checkstyle.xml under HIVE_REPO/checkstyle.xml, use a
> modified version located at:
> https://github.com/vineetgarg02/misc/blob/master/hive_tools/checkstyle.xml
>
> The current format seems to be unsupported by the Checkstyle plugin.
>
> Thanks,
> Vineet
>


Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/
---

(Updated Feb. 2, 2018, 1:55 a.m.)


Review request for hive, Ashutosh Chauhan, Gopal V, and Jesús Camacho Rodríguez.


Changes
---

Changes per review comments.


Bugs: HIVE-18513
https://issues.apache.org/jira/browse/HIVE-18513


Repository: hive-git


Description
---

- For queries that result in MR/Tez/Spark jobs on the cluster, save the 
temporary query results to a cache directory where they can be re-used.
- Add QueryResultsCache to manage cached results. Currently cache invalidation 
is time-based; update-based cache invalidation needs to be added later.
- Driver/SemanticAnalyzer/Calcite planner changes to look up queries in the 
cache and use them in place of the query plan.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99 
  common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java 2767bca 
  data/conf/hive-site.xml 01f83d1 
  data/conf/llap/hive-site.xml cdda875 
  data/conf/perf-reg/spark/hive-site.xml 497a61f 
  data/conf/perf-reg/tez/hive-site.xml 012369f 
  data/conf/rlist/hive-site.xml 9de00e5 
  data/conf/spark/local/hive-site.xml fd0e6a0 
  data/conf/spark/standalone/hive-site.xml 1e5bd65 
  data/conf/spark/yarn-client/hive-site.xml a9a788b 
  data/conf/tez/hive-site.xml 4519678 
  itests/hive-blobstore/src/test/resources/hive-site.xml 038db0d 
  itests/src/test/resources/testconfiguration.properties d86ff58 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 4432aca 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b0 
  ql/src/java/org/apache/hadoop/hive/ql/cache/results/CacheUsage.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7348faa 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
f0dd167 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 372cfad 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 85a1f34 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java ae2ec3d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java dbf9363 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 7243dc7 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3946d4a 
  ql/src/test/queries/clientpositive/results_cache_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_capacity.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_lifetime.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_temptable.q PRE-CREATION 
  ql/src/test/queries/clientpositive/results_cache_with_masking.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/results_cache_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_capacity.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_lifetime.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_temptable.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/results_cache_with_masking.q.out 
PRE-CREATION 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 2a528cd 


Diff: https://reviews.apache.org/r/65304/diff/5/

Changes: https://reviews.apache.org/r/65304/diff/4-5/


Testing
---

qfile tests added.


Thanks,

Jason Dere



[jira] [Created] (HIVE-18608) ORC should allow selectively disabling dictionary-encoding on specified columns

2018-02-01 Thread Mithun Radhakrishnan (JIRA)
Mithun Radhakrishnan created HIVE-18608:
---

 Summary: ORC should allow selectively disabling 
dictionary-encoding on specified columns
 Key: HIVE-18608
 URL: https://issues.apache.org/jira/browse/HIVE-18608
 Project: Hive
  Issue Type: New Feature
  Components: ORC
Affects Versions: 3.0.0, 2.4.0, 2.2.1
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan


Just as ORC allows the choice of columns to enable bloom-filters on, it would 
be nice to have a way to specify which columns {{DICTIONARY_V2}} encoding 
should be disabled on.

Currently, the choice of dictionary-encoding depends on the results of sampling 
the first row-stride within a stripe. If the user knows that a column's 
cardinality is bound to prevent an effective dictionary, she might choose to 
simply disable it on just that column, and avoid the cost of sampling in the 
first row-stride.
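
To make the request concrete, here is a minimal sketch of how a writer might consult a per-column opt-out before sampling. It is not ORC code: the class `DictionaryPolicy`, the comma-separated column list, and the ratio threshold passed to the constructor are all hypothetical names and values chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of ORC: decides whether a column gets a dictionary. */
final class DictionaryPolicy {
  private final Set<String> directColumns;  // columns with dictionary encoding disabled
  private final double keyRatioThreshold;   // max distinct/total ratio for a dictionary

  DictionaryPolicy(String directColumnList, double keyRatioThreshold) {
    this.directColumns = parseColumns(directColumnList);
    this.keyRatioThreshold = keyRatioThreshold;
  }

  /** Parses a comma-separated column list, ignoring blanks and case. */
  static Set<String> parseColumns(String list) {
    if (list == null) {
      return Collections.emptySet();
    }
    return Arrays.stream(list.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .map(s -> s.toLowerCase(Locale.ROOT))
        .collect(Collectors.toSet());
  }

  /** True if the user opted this column out, so sampling can be skipped entirely. */
  boolean isDictionaryDisabled(String column) {
    return directColumns.contains(column.toLowerCase(Locale.ROOT));
  }

  /** Falls back to the sampled distinct/total ratio only for columns not opted out. */
  boolean useDictionary(String column, long distinctValues, long totalValues) {
    if (isDictionaryDisabled(column)) {
      return false;
    }
    return totalValues > 0
        && (double) distinctValues / totalValues <= keyRatioThreshold;
  }
}
```

With a policy built from the list `"id, payload"`, a writer would skip sampling for `payload` and keep the existing sampled decision for every other column.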



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65342: HIVE-18546

2018-02-01 Thread Jesús Camacho Rodríguez


> On Feb. 2, 2018, 1:31 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 1238 (patched)
> > 
> >
> > If it's an MV, all metadata related to it is erased. That's good. But if 
> > it's a table participating in an MV, should dropping that table also 
> > result in an automatic drop of all MVs it's part of? Not sure how other DBs 
> > handle it. But we should note this in comments here.

I planned to create a follow-up for this. Maybe fail the drop unless a cascade 
option is used? I will check how it is done in other RDBMSs.
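
As a rough illustration of the "fail unless cascade" idea, the sketch below tracks which materialized views depend on which source tables. The class `MvDependencyRegistry` and its methods are hypothetical and are not taken from ObjectStore; real behavior is left to the follow-up.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical registry of materialized-view dependencies; not metastore code. */
final class MvDependencyRegistry {
  // source table name -> materialized views built on top of it
  private final Map<String, Set<String>> viewsByTable = new HashMap<>();

  void registerView(String view, Collection<String> sourceTables) {
    for (String table : sourceTables) {
      viewsByTable.computeIfAbsent(table, k -> new TreeSet<>()).add(view);
    }
  }

  /**
   * Drops a table. Without cascade the drop is rejected while views depend on it;
   * with cascade the dependent views are dropped too and returned.
   */
  Set<String> dropTable(String table, boolean cascade) {
    Set<String> dependents = viewsByTable.getOrDefault(table, Collections.emptySet());
    if (!dependents.isEmpty() && !cascade) {
      throw new IllegalStateException(
          "Table " + table + " is used by materialized views " + dependents);
    }
    Set<String> dropped = new TreeSet<>(dependents);
    viewsByTable.remove(table);
    // A dropped view no longer depends on any of its other source tables.
    for (Set<String> views : viewsByTable.values()) {
      views.removeAll(dropped);
    }
    return dropped;
  }
}
```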


> On Feb. 2, 2018, 1:31 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 1914 (original), 1945 (patched)
> > 
> >
> > Shall we validate the length of the txnList and throw if it's too long?

If we try to write a txnList longer than the CLOB size supported by the 
backing RDBMS, the write will fail. I will specify the length of the CLOB at 
the RDBMS DDL level so the limit is exactly the same for all metastore RDBMSs. 
Introducing the limit check here seems redundant, since the write will fail. What 
do you think?
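
For comparison, the explicit check being discussed could look roughly like the sketch below. `TxnListValidator` and the `maxLength` parameter are hypothetical; the actual limit would be whatever CLOB length ends up in the DDL scripts. The only gain over letting the write fail is an error message that does not depend on the backing RDBMS.

```java
/** Hypothetical validator; not part of ObjectStore. */
final class TxnListValidator {
  private TxnListValidator() {}

  /**
   * Returns the txnList unchanged if it fits, otherwise throws with a
   * message that does not depend on the backing database.
   */
  static String validate(String txnList, int maxLength) {
    if (txnList != null && txnList.length() > maxLength) {
      throw new IllegalArgumentException(
          "txnList length " + txnList.length() + " exceeds limit " + maxLength);
    }
    return txnList;
  }
}
```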


> On Feb. 2, 2018, 1:31 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MCreationMetadata.java
> > Lines 22 (patched)
> > 
> >
> > Good to add javadoc for the class.

I will, thanks.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196673
---



Re: Review Request 65342: HIVE-18546

2018-02-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196673
---




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 1238 (patched)


If it's an MV, all metadata related to it is erased. That's good. But if it's a 
table participating in an MV, should dropping that table also result in an 
automatic drop of all MVs it's part of? Not sure how other DBs handle it. But we 
should note this in comments here.



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 1914 (original), 1945 (patched)


Shall we validate the length of the txnList and throw if it's too long?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MCreationMetadata.java
Lines 22 (patched)


Good to add javadoc for the class.


- Ashutosh Chauhan


On Jan. 31, 2018, 12:07 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65342/
> ---
> 
> (Updated Jan. 31, 2018, 12:07 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18546
> https://issues.apache.org/jira/browse/HIVE-18546
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18546
> 
> 
> Diffs
> -
> 
>   metastore/scripts/upgrade/derby/048-HIVE-14498.derby.sql 
> 4ffd054530503681de1c9f6d65f8187fc1b7520d 
>   metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql 
> 6a59b0df712c8a9f9be880cec5fd8c8eddda4a7d 
>   metastore/scripts/upgrade/derby/hive-txn-schema-3.0.0.derby.sql 
> d72b06cb5866edf93dbcbb20268fc899439e5c43 
>   metastore/scripts/upgrade/hive/hive-schema-3.0.0.hive.sql 
> eb4f0124b5a7829e58d5e9a6a604c201ccea998a 
>   metastore/scripts/upgrade/mssql/033-HIVE-14498.mssql.sql 
> 3a47600bb09e2c20cc12f8759e1287001367604e 
>   metastore/scripts/upgrade/mssql/hive-schema-3.0.0.mssql.sql 
> c45bb3e323c640223b19831abbf4e806c3019f0b 
>   metastore/scripts/upgrade/mysql/048-HIVE-14498.mysql.sql 
> 986eaf5272eab560fa2f862910aaf74c5332c716 
>   metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 
> 01c995d632d94a8f9cc3f46f94c54290abb3da13 
>   metastore/scripts/upgrade/mysql/hive-txn-schema-3.0.0.mysql.sql 
> 497846f994d431d8717aea36d4ad569892e3c8c3 
>   metastore/scripts/upgrade/oracle/048-HIVE-14498.oracle.sql 
> 0b01e89d92f7f48439024aeb326d675d123f0f8c 
>   metastore/scripts/upgrade/oracle/hive-schema-3.0.0.oracle.sql 
> e1aee6fb6c84999b17f87f80750582fafeae063f 
>   metastore/scripts/upgrade/oracle/hive-txn-schema-3.0.0.oracle.sql 
> 5411bc47103f901623244bc26c0ace87e10ad2e1 
>   metastore/scripts/upgrade/postgres/047-HIVE-14498.postgres.sql 
> 8d4de8870d93bab49c873cab44e6714b93491744 
>   metastore/scripts/upgrade/postgres/hive-schema-3.0.0.postgres.sql 
> 28cb01684a46aaeea40d7cbe1973d7bc20810988 
>   metastore/scripts/upgrade/postgres/hive-txn-schema-3.0.0.postgres.sql 
> a81d6eec6d6235706f1225d541f8290971cc6215 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> 51ef39057434c41fbe760c547e3bf231e65e4cc0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
> 9b0ffe0e91db05ae623531248f12745266789a11 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> d159e4bed1cd4ff04bed1c397318bc2951c02a51 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> dbf9363d11deea5377808094d7cb3331ee0f999f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java 
> 97baf25ea8bbe6f55e46e2ea5bdf33a5a71eecdf 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ImportTableDesc.java 
> 3535fa4d02106f4e96af6c33ffa291c8db21e3bc 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 
> aa95d2fcdcd0b3cede35537a1d7d041ee738e4a8 
>   ql/src/test/results/clientpositive/llap/sysdb.q.out 
> 5ed427fd2aa6fbb83877031e6692bd8f1994730d 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> 42bc9297e72ac8fd77352cb786cfed3abf5af59b 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> 8b78230a32d4d4339189c1db4b533ed04ec080af 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  6a2ff6c4c681b2dbaf339b214663212a2e6dab22 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> df646a7d1771892e4404be5c4fba183c0f914510 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 27f8c0f2fcb24a90be8a44d68947589004286c28 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AbortTxnsRequest.java
>  

Intellij + Checkstyle setup

2018-02-01 Thread Vineet Garg
Hi,

If you would like to use IntelliJ’s Checkstyle plugin and are unable to import 
the current HIVE_REPO/checkstyle.xml file, use the modified 
version located at:
https://github.com/vineetgarg02/misc/blob/master/hive_tools/checkstyle.xml

The current format seems to be unsupported by the Checkstyle plugin.

Thanks,
Vineet


Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/
---

(Updated Feb. 2, 2018, 12:38 a.m.)


Review request for hive, Eugene Koifman, Gopal V, Jason Dere, and Thejas Nair.


Changes
---

Made one mistake in the previous patch. Fixed it in this one. Sorry about that.


Bugs: HIVE-18350
https://issues.apache.org/jira/browse/HIVE-18350


Repository: hive-git


Description
---

Made changes for both bucketed and non-bucketed tables.
Added a positive test for a non-bucketed table which renames the loaded file.
Added a couple of negative tests for bucketed tables which reject a load with 
an inconsistent file name.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
26afe90faa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
ef5e7edcd6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
dc698c8de8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 69d9f3125a 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 bacc44482a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out b9c2e6f827 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
054b0d00be 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
95d329862c 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
e711715aa5 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
53c685cb11 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
8cfa113794 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
fce5e0cfc4 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
8250eca099 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
eb813c1734 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h df646a7d17 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
27f8c0f2fc 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 f317b0393f 
  standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
  standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
25e9a889b2 
  standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 3a11a0582a 
  standalone-metastore/src/main/thrift/hive_metastore.thrift 93f3e53de2 


Diff: https://reviews.apache.org/r/65130/diff/8/

Changes: https://reviews.apache.org/r/65130/diff/7-8/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/
---

(Updated Feb. 2, 2018, 12:24 a.m.)


Review request for hive, Eugene Koifman, Gopal V, Jason Dere, and Thejas Nair.


Changes
---

Implemented review comments by Jason.
Made the two fields added to Table struct optional and added them at the end.


Bugs: HIVE-18350
https://issues.apache.org/jira/browse/HIVE-18350


Repository: hive-git


Description
---

Made changes for both bucketed and non-bucketed tables.
Added a positive test for a non-bucketed table which renames the loaded file.
Added a couple of negative tests for bucketed tables which reject a load with 
an inconsistent file name.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
26afe90faa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
ef5e7edcd6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
dc698c8de8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 69d9f3125a 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 bacc44482a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out b9c2e6f827 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
054b0d00be 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
95d329862c 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
e711715aa5 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
53c685cb11 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
8cfa113794 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
fce5e0cfc4 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
8250eca099 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
eb813c1734 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h df646a7d17 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
27f8c0f2fc 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 f317b0393f 
  standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
  standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
25e9a889b2 
  standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 3a11a0582a 
  standalone-metastore/src/main/thrift/hive_metastore.thrift 93f3e53de2 


Diff: https://reviews.apache.org/r/65130/diff/7/

Changes: https://reviews.apache.org/r/65130/diff/6-7/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-18607) HBase HFile write does strange things

2018-02-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18607:
---

 Summary: HBase HFile write does strange things
 Key: HIVE-18607
 URL: https://issues.apache.org/jira/browse/HIVE-18607
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


I cannot get the HBaseCliDriver to run locally, so first I'll use HiveQA to 
check something.
There's some strange code in the output handler that changes output directory 
into a file because Hive supposedly wants that. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/#review196659
---




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java
Lines 557 (patched)


Log which method was used for the path-to-bucket mapping



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java
Lines 87 (patched)


Might need to set inputToBucketMap = null in the case that sz == 0, if you 
are overwriting an existing instance of CustomVertexConfiguration.
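
The concern can be shown with a small Writable-style sketch. `BucketMapHolder` is a hypothetical stand-in, not the real CustomVertexConfiguration, and the `toBytes`/`fromBytes` helpers exist only to keep the example self-contained. The point is the `sz == 0` branch: without the reset, a reused instance would keep the map from its previous read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a reusable Writable-style configuration object. */
final class BucketMapHolder {
  private Map<String, Integer> inputToBucketMap;

  BucketMapHolder(Map<String, Integer> map) {
    this.inputToBucketMap = map;
  }

  Map<String, Integer> getInputToBucketMap() {
    return inputToBucketMap;
  }

  void write(DataOutput out) throws IOException {
    int sz = inputToBucketMap == null ? 0 : inputToBucketMap.size();
    out.writeInt(sz);
    if (sz == 0) {
      return;
    }
    for (Map.Entry<String, Integer> e : inputToBucketMap.entrySet()) {
      out.writeUTF(e.getKey());
      out.writeInt(e.getValue());
    }
  }

  void readFields(DataInput in) throws IOException {
    int sz = in.readInt();
    if (sz == 0) {
      // The instance may be reused: clear any map left by an earlier read.
      inputToBucketMap = null;
      return;
    }
    Map<String, Integer> fresh = new HashMap<>();
    for (int i = 0; i < sz; i++) {
      String path = in.readUTF();
      fresh.put(path, in.readInt());
    }
    inputToBucketMap = fresh;
  }

  /** Serializes to a byte array, wrapping checked exceptions for convenience. */
  byte[] toBytes() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      write(new DataOutputStream(bytes));
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Overwrites this instance from a byte array produced by toBytes(). */
  void fromBytes(byte[] data) {
    try {
      readFields(new DataInputStream(new ByteArrayInputStream(data)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```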



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java
Lines 581 (patched)


Use Preconditions; assert may be a no-op.
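
A short sketch of the difference follows. Java skips `assert` statements unless the JVM is started with `-ea`, so a check that must always run needs an explicit throw. `BucketCheck` and its methods are hypothetical, and the local `checkState` is only a dependency-free stand-in for Guava's `Preconditions.checkState`.

```java
/** Hypothetical example contrasting assert with an always-on precondition. */
final class BucketCheck {
  private BucketCheck() {}

  /** Stand-in for Guava's Preconditions.checkState, so the sketch has no dependencies. */
  static void checkState(boolean ok, String message) {
    if (!ok) {
      throw new IllegalStateException(message);
    }
  }

  /** The assert is skipped unless the JVM runs with -ea, so bad input can slip through. */
  static int bucketWithAssert(int bucketId, int numBuckets) {
    assert bucketId >= 0 && bucketId < numBuckets : "bucket out of range";
    return bucketId;
  }

  /** The explicit check runs regardless of JVM flags. */
  static int bucketWithCheck(int bucketId, int numBuckets) {
    checkState(bucketId >= 0 && bucketId < numBuckets,
        "bucket " + bucketId + " out of range [0, " + numBuckets + ")");
    return bucketId;
  }
}
```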


- Jason Dere


On Feb. 1, 2018, 11:29 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65130/
> ---
> 
> (Updated Feb. 1, 2018, 11:29 p.m.)
> 
> 
> Review request for hive, Eugene Koifman, Gopal V, Jason Dere, and Thejas Nair.
> 
> 
> Bugs: HIVE-18350
> https://issues.apache.org/jira/browse/HIVE-18350
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Made changes for both bucketed and non-bucketed tables.
> Added a positive test for a non-bucketed table which renames the loaded file.
> Added a couple of negative tests for bucketed tables which reject a load with 
> an inconsistent file name.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
> 91aa4fa269 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
>  9614114083 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestReplChangeManager.java
>  6ade76d0c2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
> 26afe90faa 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
> ef5e7edcd6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
> dc698c8de8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
>  69d9f3125a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
>  bacc44482a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 5868d4dd56 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 
> b9c2e6f827 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
> 054b0d00be 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
> 95d329862c 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
> e711715aa5 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
> 53c685cb11 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
> 8cfa113794 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
> fce5e0cfc4 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
> 8250eca099 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
> eb813c1734 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> df646a7d17 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 27f8c0f2fc 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
>  f317b0393f 
>   standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
>   standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
> 25e9a889b2 
>   standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
> 3a11a0582a 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  b3d99a1da5 
>   
> 

Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jason Dere


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3699 (patched)
> > 
> >
> > Is this the permission for each results directory? Does this mean that 
> > results cannot be shared by different users?
> > 
> > Why does this need to be configurable? (I would assume this is not 
> > something that you let the user decide).
> 
> Jason Dere wrote:
> If this is HiveServer2 (with doAs=false), the hive user would be the 
> directory owner regardless of which user is submitting the query, so I would 
> not expect issues with sharing the cache in the HiveServer2 case. One danger 
> with making this directory readable by others is the possibility that it 
> allows a user to access cached results which the user may not have permission 
> to see, if user-level filtering/masking rules are enabled. Different 
> instances of Hive do not share caches, so sharing results between different 
> Hive CLI instances is not something I am worrying about here.
> 
> I was basing the directory creation rules on the scratchdir logic, which 
> does allow this to be configurable. But really the setting at the time of 
> cache initialization is what matters. If you think this should just be 
> hardcoded to 700 let me know and I can make the change.
> 
> Jesús Camacho Rodríguez wrote:
> We can remove the configuration flag indeed, I do not think it should be 
> configurable from Hive via conf.

Ok, will remove the config flag for cache dir permissions and hardcode to 700.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 1829 (patched)
> > 
> >
> > Currently what happens when we get this exception?
> > 
> > It seems to me that mechanism in HIVE-17626 works when execution itself 
> > fails, but in this case, whether the cache entry is still valid or not can 
> > be inferred statically at planning time, hence I am not sure whether it 
> > should be handled the same way? It seems we will have certain overhead that 
> > might not be necessary.
> > 
> > Can't we check the validity of the entry when we are replacing the plan 
> > by the scan on the cached results, e.g., in SemanticAnalyzer?
> 
> Jason Dere wrote:
> So the table locks for the query are not acquired until query execution, 
> which occurs after query compilation. If we implement automatic invalidation 
> of the cache based on updates to the Hive tables, the following could happen:
> 
> 1. Query A goes through query compilation and finds an entry in the cache 
> that can be used to satisfy the query. At this point there have been no 
> updates to the table which would invalidate the cache.
> 2. Query B acquires a write lock and begins making updates to one or more 
> of the tables involved in query A
> 3. Query A attempts to acquire read locks and blocks while query B is 
> running.
> 4. Query B finishes updating the tables and releases its lock.
> 5. Query A now acquires the read lock, but at this point the cached 
> result is stale.
> 
> The options here would be to either attempt to recompile the query (the 
> current approach), or to just go ahead and serve the stale results.
> 
> Jesús Camacho Rodríguez wrote:
> Should query A get a read lock on the original source tables or only on 
> the cached results? AFAIK write lock is only acquired when we make the move 
> of the results to the final directory. Hence, we do not actually get any 
> proper "transactional" guarantees with these locks. Hive is also moving 
> towards making ACID/MM the default (I think this has already happened in 
> master) which do not rely on this type of locking. Hence, I think we can 
> remove this.

ok, will make changes here.
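
For reference, the recheck that the five-step interleaving above motivates can be sketched as follows. This is only an illustration of the "recompile if stale" option: `CachedResultCheck`, the per-table write counter, and the `Plan` enum are hypothetical and do not reflect how Driver or QueryResultsCache are implemented.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of revalidating a cache entry after lock acquisition. */
final class CachedResultCheck {
  private CachedResultCheck() {}

  /** Stand-in for a table whose updates bump a version counter. */
  static final class TableState {
    private final AtomicLong writeVersion = new AtomicLong();

    long currentVersion() {
      return writeVersion.get();
    }

    void recordWrite() {
      writeVersion.incrementAndGet();
    }
  }

  /** Cache entry that remembers the table version its results were computed from. */
  static final class CacheEntry {
    private final long versionAtCaching;

    CacheEntry(long versionAtCaching) {
      this.versionAtCaching = versionAtCaching;
    }

    boolean isValid(TableState table) {
      return table.currentVersion() == versionAtCaching;
    }
  }

  enum Plan { USE_CACHE, RECOMPILE }

  /** Called once the read lock is held (step 5): the entry may have gone stale meanwhile. */
  static Plan afterLockAcquired(CacheEntry entry, TableState table) {
    return entry.isValid(table) ? Plan.USE_CACHE : Plan.RECOMPILE;
  }
}
```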


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
> > Lines 174 (patched)
> > 
> >
> > What happens if execution fails? Will the results still be cleaned 
> > properly?
> 
> Jason Dere wrote:
> This is being called in Driver.closeInProcess()/Driver.close(), so my 
> impression was this should be the case.
> 
> Jesús Camacho Rodríguez wrote:
> Makes sense. What if HS2 fails or closes? Should we add it to the 
> shutdown hook so it cleans it even if readers number is greater than zero? 
> (Not sure about the order in which things happen in this case, just want to 
> make sure we do not leave any residual data/dirs).

The QueryResultsCache initialization calls deleteOnExit() for the results 
cache directory, so all subdirectories of this directory (representing cached 
results) should be deleted on 

Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/#review196654
---




standalone-metastore/src/main/thrift/hive_metastore.thrift
Line 327 (original), 327 (patched)


I'm pretty sure this was done incorrectly - you need to leave the existing 
field definitions (identifier and field name) unchanged, and the new fields you 
add should have been assigned IDs 17, 18. The thrift changes you currently have 
will break backwards compatibility with previous versions of hive because the 
fields specified by identifiers 13-16 will have changed from what they used to 
be.

Also, do you need to add new fields, or could you have simply added 
bucketingVersion/expertMode as part of the tableProperties map (the parameters 
field in this thrift definition)?
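
The compatibility problem can be simulated without Thrift itself. On the wire, a Thrift struct field is identified by its numeric ID rather than its name, and readers skip IDs they do not know. The sketch below models a record as an ID-to-value map; `FieldIdCompat` and the names `existingA`/`existingB` are hypothetical placeholders, not the real fields 13-16 of the Table struct.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical simulation of decoding by field ID; not Thrift code. */
final class FieldIdCompat {
  private FieldIdCompat() {}

  /**
   * Decodes a wire record (field ID -> value) with the reader's schema
   * (field ID -> field name). IDs unknown to the reader are skipped.
   */
  static Map<String, Object> decode(Map<Integer, Object> wire,
                                    Map<Integer, String> readerSchema) {
    Map<String, Object> fields = new LinkedHashMap<>();
    for (Map.Entry<Integer, Object> e : wire.entrySet()) {
      String name = readerSchema.get(e.getKey());
      if (name != null) {
        fields.put(name, e.getValue());
      }
    }
    return fields;
  }
}
```

If an old writer sends a value under ID 13, a reader whose schema appends the new fields as 17 and 18 still decodes it as the original field. A reader whose schema reassigns ID 13 to `bucketingVersion` would silently read the old value as the new field, which is the breakage described above.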


- Jason Dere


On Feb. 1, 2018, 8:49 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65130/
> ---
> 
> (Updated Feb. 1, 2018, 8:49 p.m.)
> 
> 
> Review request for hive, Eugene Koifman, Gopal V, and Jason Dere.
> 
> 
> Bugs: HIVE-18350
> https://issues.apache.org/jira/browse/HIVE-18350
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Made changes for both bucketed and non-bucketed tables.
> Added a positive test for a non-bucketed table which renames the loaded file.
> Added a couple of negative tests for bucketed tables which reject a load with 
> an inconsistent file name.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
> 91aa4fa269 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
>  9614114083 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestReplChangeManager.java
>  6ade76d0c2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
> 26afe90faa 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
> ef5e7edcd6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
> dc698c8de8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
>  69d9f3125a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
>  bacc44482a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 5868d4dd56 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 
> b9c2e6f827 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
> 054b0d00be 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
> 95d329862c 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
> e711715aa5 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
> 53c685cb11 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
> 8cfa113794 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
> fce5e0cfc4 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
> 8250eca099 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
> eb813c1734 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> df646a7d17 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 27f8c0f2fc 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
>  f317b0393f 
>   standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
>   standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
> 25e9a889b2 
>   standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
> 3a11a0582a 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  

Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/
---

(Updated Feb. 1, 2018, 8:49 p.m.)


Review request for hive, Eugene Koifman, Gopal V, and Jason Dere.


Changes
---

Rebased.
Updated results for bucket_mapjoin_mismatch1.q


Bugs: HIVE-18350
https://issues.apache.org/jira/browse/HIVE-18350


Repository: hive-git


Description
---

Made changes for both bucketed and non-bucketed tables.
Added a positive test for a non-bucketed table, which renames the loaded file.
Added a couple of negative tests for bucketed tables, which reject a load with 
an inconsistent file name.


Diffs (updated)
-

  hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
91aa4fa269 
  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
 9614114083 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestReplChangeManager.java
 6ade76d0c2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
26afe90faa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
ef5e7edcd6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
dc698c8de8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 69d9f3125a 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 bacc44482a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out b9c2e6f827 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
054b0d00be 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
95d329862c 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
e711715aa5 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
53c685cb11 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
8cfa113794 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
fce5e0cfc4 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
8250eca099 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
eb813c1734 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h df646a7d17 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
27f8c0f2fc 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 f317b0393f 
  standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
  standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
25e9a889b2 
  standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 3a11a0582a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 b3d99a1da5 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/client/builder/TableBuilder.java
 69acf3cfff 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MTable.java
 6c40ae8753 
  standalone-metastore/src/main/thrift/hive_metastore.thrift 93f3e53de2 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStorePartitionSpecs.java
 57e5a4126e 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 372dee6369 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestOldSchema.java
 6a44833a67 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 b9a8f61c69 


Diff: https://reviews.apache.org/r/65130/diff/6/

Changes: https://reviews.apache.org/r/65130/diff/5-6/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 65304: HIVE-18513 Query results caching

2018-02-01 Thread Jesús Camacho Rodríguez


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3699 (patched)
> > 
> >
> > Is this the permission for each results directory? Does this mean that 
> > results cannot be shared by different users?
> > 
> > Why does this need to be configurable? (I would assume this is not 
> > something that you let the user decide).
> 
> Jason Dere wrote:
> If this is HiveServer2 (with doAs=false), the hive user would be the 
> directory owner regardless of which user is submitting the query, so I would 
> not expect issues with sharing the cache in the HiveServer2 case. One danger 
> with making this directory readable by others is the possibility that it 
> allows a user to access cached results which the user may not have permission 
> to see, if user-level filtering/masking rules are enabled. Different 
> instances of Hive do not share caches, so sharing results between different 
> Hive CLI instances is not something I am worrying about here.
> 
> I was basing the directory creation rules on the scratchdir logic, which 
> does allow this to be configurable. But really the setting at the time of 
> cache initialization is what matters. If you think this should just be 
> hardcoded to 700 let me know and I can make the change.

We can remove the configuration flag indeed, I do not think it should be 
configurable from Hive via conf.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 1829 (patched)
> > 
> >
> > Currently what happens when we get this exception?
> > 
> > It seems to me that mechanism in HIVE-17626 works when execution itself 
> > fails, but in this case, whether the cache entry is still valid or not can 
> > be inferred statically at planning time, hence I am not sure whether it 
> > should be handled the same way? It seems we will have certain overhead that 
> > might not be necessary.
> > 
> > Can't we check the validity of the entry when we are replacing the plan 
> > by the scan on the cached results, e.g., in SemanticAnalyzer?
> 
> Jason Dere wrote:
> So the table locks for the query are not acquired until query execution, 
> which occurs after query compilation. If we implement automatic invalidation 
> of the cache based on updates to the Hive tables, the following could happen:
> 
> 1. Query A goes through query compilation and finds an entry in the cache 
> that can be used to satisfy the query. At this point there have been no 
> updates to the table which would invalidate the cache.
> 2. Query B acquires a write lock and begins making updates to one or more 
> of the tables involved in query A
> 3. Query A attempts to acquire read locks and blocks while query B is 
> running.
> 4. Query B finishes updating the tables and releases its lock.
> 5. Query A now acquires the read lock, but at this point the cached 
> result is stale.
> 
> The options here would be to either attempt to recompile the query (the 
> current approach), or to just go ahead and serve the stale results.

Should query A get a read lock on the original source tables or only on the 
cached results? AFAIK write lock is only acquired when we make the move of the 
results to the final directory. Hence, we do not actually get any proper 
"transactional" guarantees with these locks. Hive is also moving towards making 
ACID/MM the default (I think this has already happened in master) which do not 
rely on this type of locking. Hence, I think we can remove this.
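
For reference, the race in steps 1-5 above can be modelled with a small sketch. This is only an illustration under stated assumptions: the types below (TableState, CacheEntry) and the write counter are hypothetical and do not exist in Hive; they stand in for whatever signal would tell us that a table changed after the query was compiled.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical model of the race in steps 1-5; none of these types exist in Hive.
final class StaleCacheSketch {
    private StaleCacheSketch() {}

    // Stand-in for a table: the counter is bumped by every committed write.
    static final class TableState {
        final AtomicLong writeCount = new AtomicLong();
    }

    // A cached result remembers the write count it was computed against.
    static final class CacheEntry {
        final String result;
        final long writeCountWhenCached;

        CacheEntry(String result, long writeCountWhenCached) {
            this.result = result;
            this.writeCountWhenCached = writeCountWhenCached;
        }

        boolean isStale(TableState table) {
            return table.writeCount.get() != writeCountWhenCached;
        }
    }

    // Meant to run after the read lock is acquired (step 5): serve the cached
    // result only if no write happened since compilation, otherwise recompile.
    static String answer(CacheEntry entry, TableState table, Supplier<String> recompileAndRun) {
        return entry.isStale(table) ? recompileAndRun.get() : entry.result;
    }

    public static void main(String[] args) {
        TableState table = new TableState();
        CacheEntry entry = new CacheEntry("cached rows", table.writeCount.get());
        System.out.println(answer(entry, table, () -> "fresh rows")); // nothing changed yet
        table.writeCount.incrementAndGet();                           // query B commits
        System.out.println(answer(entry, table, () -> "fresh rows")); // entry is now stale
    }
}
```

Serving the stale result instead would simply mean skipping the isStale check.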


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
> > Lines 174 (patched)
> > 
> >
> > What happens if execution fails? Will the results still be cleaned 
> > properly?
> 
> Jason Dere wrote:
> This is being called in Driver.closeInProcess()/Driver.close(), so my 
> impression was this should be the case.

Makes sense. What if HS2 fails or shuts down? Should we add this to the 
shutdown hook so that the results are cleaned up even if the number of readers 
is greater than zero? (I am not sure about the order in which things happen in 
this case; I just want to make sure we do not leave any residual data/dirs.)


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/#review196440
---


On Jan. 31, 2018, 6:16 a.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> 

[jira] [Created] (HIVE-18606) CTAS on empty table throws NPE from org.apache.hadoop.hive.ql.exec.MoveTask

2018-02-01 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-18606:
-

 Summary: CTAS on empty table throws NPE from 
org.apache.hadoop.hive.ql.exec.MoveTask
 Key: HIVE-18606
 URL: https://issues.apache.org/jira/browse/HIVE-18606
 Project: Hive
  Issue Type: Bug
Reporter: Eugene Koifman


{noformat}
@Test
public void testCtasEmpty() throws Exception {
  MetastoreConf.setBoolVar(hiveConf, 
MetastoreConf.ConfVars.CREATE_TABLES_AS_ACID, true);
  runStatementOnDriver("create table myctas stored as ORC as" +
  " select a, b from " + Table.NONACIDORCTBL);
  List rs = runStatementOnDriver("select ROW__ID, a, b, 
INPUT__FILE__NAME" +
  " from myctas order by ROW__ID");
}
{noformat}
{noformat}
2018-02-01T19:08:52,813 INFO  [HiveServer2-Background-Pool: Thread-463]: 
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(822)) - 114: Done cleaning 
up thread local RawStore
2018-02-01T19:08:52,813 INFO  [HiveServer2-Background-Pool: Thread-463]: 
HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(305)) - ugi=hive 
ip=unknown-ip-addr  cmd=Done cleaning up thread local RawStore
2018-02-01T19:08:52,815 ERROR [HiveServer2-Background-Pool: Thread-463]: 
exec.Task (SessionState.java:printError(1228)) - Failed with exception null
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.moveAcidFiles(Hive.java:3816)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:298)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2267)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1919)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1651)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1395)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1388)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:253)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:92)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:345)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:358)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

2018-02-01T19:08:52,815 ERROR [HiveServer2-Background-Pool: Thread-463]: 
ql.Driver (SessionState.java:printError(1228)) - FAILED: Execution Error, 
return code 1 from {noformat}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65130: HIVE-18350 : load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65130/
---

(Updated Feb. 1, 2018, 7:24 p.m.)


Review request for hive, Eugene Koifman, Gopal V, and Jason Dere.


Changes
---

Added two new fields to the Table metadata.
1. bucketingVersion : Default is 2. It is 1 for older tables, which use Java 
hash, and 2 for new tables, which will use murmur hash. The plumbing is done. 
The code to actually change the hashing logic is yet to be done.
2. expertMode : Default is false. If a user loads data into a bucketed table 
without launching a Tez job (pending work), this is set to true. It helps in 
debugging wrong results in queries on bucketed tables.

Load data on bucketed tables only accepts file names like 00_0, 01_0, etc. 
It rejects the load otherwise.
Fixed the CustomPartitionVertex code for SMB and bucket map joins. The logic 
that iterated over the files and assigned bucket ids is replaced by examining 
the file name and assigning the bucket id from it.
For SMB, the small table must have a number of buckets less than or equal to 
that of the big table.
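
The file-name-based bucket assignment described above can be sketched roughly as follows. This is not the code from the patch: the class and method names are hypothetical, and the sketch assumes that the bucket id is the decimal number before the first underscore in the file name.

```java
import java.util.OptionalInt;

// Hypothetical helper, not taken from the patch: derives a bucket id from a
// file name of the form <digits>_<suffix>, and reports "no bucket id" for
// names that do not follow that pattern so that the load can be rejected.
final class BucketFileNames {
    private BucketFileNames() {}

    static OptionalInt bucketIdFromFileName(String fileName) {
        int underscore = fileName.indexOf('_');
        if (underscore <= 0) {
            return OptionalInt.empty();           // no numeric prefix at all
        }
        String prefix = fileName.substring(0, underscore);
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();       // inconsistent file name
            }
        }
        try {
            return OptionalInt.of(Integer.parseInt(prefix));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();           // prefix too large for an int
        }
    }

    public static void main(String[] args) {
        System.out.println(bucketIdFromFileName("01_0"));
        System.out.println(bucketIdFromFileName("part-00000"));
    }
}
```

A caller would reject the load when the result is empty or when the id is not smaller than the table's bucket count.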


Bugs: HIVE-18350
https://issues.apache.org/jira/browse/HIVE-18350


Repository: hive-git


Description
---

Made changes for both bucketed and non-bucketed tables.
Added a positive test for a non-bucketed table, which renames the loaded file.
Added a couple of negative tests for bucketed tables, which reject a load with 
an inconsistent file name.


Diffs (updated)
-

  hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
91aa4fa269 
  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
 9614114083 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestReplChangeManager.java
 6ade76d0c2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
26afe90faa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
ef5e7edcd6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 9b0ffe0e91 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
dc698c8de8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 69d9f3125a 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 bacc44482a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java 9621c3be53 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q e5fdcb57e4 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q abf09e5534 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q b85c4a7aa3 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q bd780861e3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 5cfc35aa73 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 0d586fd26b 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 45704d1253 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 1959075912 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out 
054b0d00be 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_4.q.out 
95d329862c 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_5.q.out 
e711715aa5 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_7.q.out 
53c685cb11 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 
8cfa113794 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 
fce5e0cfc4 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 
8250eca099 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out 
eb813c1734 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h df646a7d17 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
27f8c0f2fc 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 f317b0393f 
  standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 6878ee1be7 
  standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
25e9a889b2 
  standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 3a11a0582a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 b3d99a1da5 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/client/builder/TableBuilder.java
 69acf3cfff 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MTable.java
 6c40ae8753 
  standalone-metastore/src/main/thrift/hive_metastore.thrift 93f3e53de2 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStorePartitionSpecs.java
 57e5a4126e 
  

[jira] [Created] (HIVE-18605) text and int compares can produce surprising results

2018-02-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18605:
---

 Summary: text and int compares can produce surprising results
 Key: HIVE-18605
 URL: https://issues.apache.org/jira/browse/HIVE-18605
 Project: Hive
  Issue Type: Bug
Reporter: Ankita Kapratwar


{noformat}
> create table foo (i int) STORED AS
>   INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
>   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat';

> insert into foo values (1);
> select * from foo where i != 'foo';
+--------+
| foo.i  |
+--------+
+--------+
No rows selected (0.562 seconds)
> select * from foo where i != '2';
+--------+
| foo.i  |
+--------+
| 1      |
+--------+
{noformat}

This can be worked around with an explicit cast to string.
Double is used as a common type according to [~ashutoshc].
Seems like if this is the case, there should be a failure when converting foo 
to double.
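
A rough model of the behaviour reported above, assuming (as stated) that both sides are converted to double, and additionally assuming that a string which is not a number converts to NULL instead of raising an error. The class and method names are made up for illustration; this is not Hive's implementation.

```java
// Illustrative model only: mimics a comparison in which both operands are
// converted to double, an unparseable string becomes NULL, and a NULL
// comparison result filters the row out.
final class ImplicitCompareModel {
    private ImplicitCompareModel() {}

    // Returns null (standing in for SQL NULL) when the text is not a number.
    static Double toDoubleOrNull(String text) {
        try {
            return Double.valueOf(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Three-valued "!=": null means the result is unknown.
    static Boolean notEquals(int left, String right) {
        Double r = toDoubleOrNull(right);
        if (r == null) {
            return null;
        }
        return (double) left != r.doubleValue();
    }

    // A WHERE clause keeps a row only when the predicate is TRUE.
    static boolean rowSelected(int i, String literal) {
        return Boolean.TRUE.equals(notEquals(i, literal));
    }

    public static void main(String[] args) {
        System.out.println(rowSelected(1, "foo")); // false: predicate is NULL
        System.out.println(rowSelected(1, "2"));   // true: 1.0 != 2.0
    }
}
```

Under this model the first query returns no rows because the predicate is unknown, not because 1 equals 'foo'.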





Re: Review Request 65276: HIVE-18516

2018-02-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65276/
---

(Updated Feb. 1, 2018, 7:10 p.m.)


Review request for hive, Eugene Koifman and Jason Dere.


Changes
---

Updated with merge conflicts due to another commit in Hive.java


Bugs: HIVE-18516
https://issues.apache.org/jira/browse/HIVE-18516


Repository: hive-git


Description
---

load data should rename files consistent with insert statements for ACID Tables.
Includes test change for a missed test.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d86ff58840 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 3b97dac8ca 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java a9cba456ef 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java 
c6a4a8926b 
  ql/src/test/queries/clientnegative/load_data_into_acid.q 2ac5b561ae 
  ql/src/test/queries/clientpositive/load_data_acid_rename.q PRE-CREATION 
  ql/src/test/queries/clientpositive/smb_mapjoin_7.q 4a6afb0496 
  ql/src/test/results/clientnegative/load_data_into_acid.q.out 46b5cdd2c8 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out 7a6f8c53a5 
  ql/src/test/results/clientpositive/llap/load_data_acid_rename.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/smb_mapjoin_7.q.out b71c5b87c1 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out ac49c02913 


Diff: https://reviews.apache.org/r/65276/diff/7/

Changes: https://reviews.apache.org/r/65276/diff/6-7/


Testing
---


Thanks,

Deepak Jaiswal



Re: Build bot comments in Jira

2018-02-01 Thread Eugene Koifman
This is great.   Thank you!

On 2/1/18, 3:18 AM, "Adam Szita"  wrote:

Hi,

I can recommend using this Chrome extension:
https://github.com/gezapeti/jira-comment-collapser. It was developed by a
colleague of ours who's working on Oozie.
It can also be found in the store:

https://chrome.google.com/webstore/search/jira%20comment%20collapser%20gezapeti?hl=en-GB_source=chrome-ntp-launcher

We just have to extend its user list with "hiveqa".

Prasanth: I'm not sure what you mean by Yetus deduplicating patch
submissions. AFAIK Yetus integration introduced no change relating to the
triggering of these jobs.

Thanks,
Adam

On 31 January 2018 at 21:32, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

> Also Yetus doesn't look like it's deduplicating patch submissions. Earlier
> only the latest patch submission would run ptest, but now all submissions
> are running ptest even if the jira is closed.
>
> Thanks
> Prasanth
>
>
>
> On Wed, Jan 31, 2018 at 12:09 PM -0800, "Vineet Garg" <
> vg...@hortonworks.com> wrote:
>
>
> I would like to know this too. With checkstyle/yetus comments it has
> become very noisy.
>
> > On Jan 31, 2018, at 10:04 AM, Eugene Koifman  wrote:
> >
> > Hi,
> > Is there way to automatically collapse the comments from build bot in
> the Jiras?  Or some config change to make that happen?
> > Maybe they can be relegated to a separate tab.
> >
> > For tickets that go through multiple patches, these comments obscure the
> discussion comments.
> >
> >
> > Thanks,
> > Eugene
>
>
>




Re: Yetus JDK version

2018-02-01 Thread Alan Gates
Ok, looking briefly at it, it looks like if we changed
testutils/…/TestScripts.java line 76 to set javaHome to 1.8 instead of 1.7
that we’ll be running ptest with 1.8.  I’m not familiar with ptest, but I’m
guessing that someone would need to make this change and then re-deploy
ptest in our test infrastructure.  Is there anything else we need to do?
We clearly already have 1.8 installed on the test machines because the code
compiles with 1.8, but I don’t know what the path is, etc.

Alan.

On Tue, Jan 30, 2018 at 4:58 PM, Alan Gates  wrote:

> I put code in the latest patch for HIVE-17983 that executes one of the
> compiled classes as part of the maven build.  (It does this to
> automatically generate the config template.)  This works locally and in the
> ptest build.  But in the Yetus tests it fails with:
>
> Exception in thread "main" java.lang.UnsupportedClassVersionError:
> org/apache/hadoop/hive/metastore/conf/ConfTemplatePrinter : Unsupported
> major.minor version 52.0
>
> This means that it is compiling with JDK 1.8 but running it with 1.7.  How
> do we switch the Yetus build so it runs maven with the correct JDK version?
>
> Alan.
>
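
For anyone checking which JDK produced a class: "major.minor version 52.0" refers to the version field in the class file header, and major version 52 corresponds to Java 8 (51 is Java 7). A minimal sketch that reads that field; the class name is hypothetical and this is not part of ptest or Yetus.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical utility: reads the major version from a class file header.
// Header layout: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
final class ClassFileVersion {
    private ClassFileVersion() {}

    static int majorVersion(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();        // minor version, not needed here
        return in.readUnsignedShort(); // major version
    }

    // Major version 45 is Java 1.1, so 51 maps to 7 and 52 maps to 8.
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        // First eight bytes of a class compiled for Java 8.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = majorVersion(new ByteArrayInputStream(header));
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```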


Re: Review Request 65271: JDBC: Provide a way for JDBC users to pass cookie info via connection string

2018-02-01 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65271/#review196636
---




jdbc/src/java/org/apache/hive/jdbc/HttpRequestInterceptorBase.java
Line 91 (original), 96 (patched)


A nit - 
I think using "+=" is more readable for appends -

cookieHeaderKeyValues +=
   ";" + entry.getKey() + "=" + entry.getValue();


- Thejas Nair


On Jan. 31, 2018, 10:50 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65271/
> ---
> 
> (Updated Jan. 31, 2018, 10:50 p.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-18447
> https://issues.apache.org/jira/browse/HIVE-18447
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18447
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestThriftHttpCLIServiceFeatures.java
>  93b10fb4b4 
>   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java cb2f09cbf2 
>   jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 5d2ddb5c21 
>   jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
> 37862be804 
>   jdbc/src/java/org/apache/hive/jdbc/HttpRequestInterceptorBase.java 
> cf1a11ecb6 
>   jdbc/src/java/org/apache/hive/jdbc/HttpTokenAuthInterceptor.java 59a91dd14c 
>   jdbc/src/java/org/apache/hive/jdbc/Utils.java f7f3854b86 
> 
> 
> Diff: https://reviews.apache.org/r/65271/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Created] (HIVE-18604) DropDatabase cascade fails when there is an index in the DB

2018-02-01 Thread Adam Szita (JIRA)
Adam Szita created HIVE-18604:
-

 Summary: DropDatabase cascade fails when there is an index in the 
DB
 Key: HIVE-18604
 URL: https://issues.apache.org/jira/browse/HIVE-18604
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Adam Szita
Assignee: Adam Szita


As seen in [HMS API 
test|https://github.com/apache/hive/blob/master/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java#L452]
 dropping database (even with cascade) is failing when an index exists in the 
corresponding database, throwing MetaException:
{code:java}
MetaException(message:Exception thrown flushing changes to datastore
)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:208)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy35.drop_table_with_environment_context(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:2495)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1092)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1007)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:859)
at 
org.apache.hadoop.hive.metastore.client.TestDatabases.testDropDatabaseWithIndexCascade(TestDatabases.java:470)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runners.Suite.runChild(Suite.java:127)
at org.junit.runners.Suite.runChild(Suite.java:26)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: javax.jdo.JDODataStoreException: Exception thrown flushing changes 
to datastore
NestedThrowables:
java.sql.BatchUpdateException: DELETE on table 'TBLS' caused a violation of 
foreign key constraint 'IDXS_FK1' for key (2). The statement has been rolled 
back.
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:171)
at 
org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:745)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
at com.sun.proxy.$Proxy33.commitTransaction(Unknown Source)
at 

[jira] [Created] (HIVE-18603) Use Hash For Partition HDFS File Path

2018-02-01 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-18603:
--

 Summary: Use Hash For Partition HDFS File Path
 Key: HIVE-18603
 URL: https://issues.apache.org/jira/browse/HIVE-18603
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 2.3.0, 1.2.0, 3.0.0, 2.4.0
Reporter: BELUGA BEHR


Currently, for partitioned tables, Hive uses the literal value of each 
partition in the HDFS file path.  Instead, perhaps we can use a hash value so 
that:

 
 # The partitioned values are obscured to a casual observer in HDFS
 # Remove the chance of having a very long HDFS file name when faced with a 
very long partitioned value
 # Remove the needs to worry about special characters in the partitioned path 
name as the hash value would only be HEX string values.

 

The suggestion here is that we retain the partition values, just as is done 
now, but the default HDFS location for each partition will use the hash of the 
value instead of the value itself.
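
A minimal sketch of what the directory name could look like. The issue does not name a hash function; SHA-256 and the class and method names below are assumptions made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: maps a partition value to a fixed-length hex string
// that could be used as the directory name instead of the literal value.
final class HashedPartitionPath {
    private HashedPartitionPath() {}

    static String hashedDirName(String partitionValue) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256"); // assumed choice of hash
            byte[] digest = md.digest(partitionValue.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The output has the same length and character set for any input.
        System.out.println(hashedDirName("a very long value / with special = characters"));
        System.out.println(hashedDirName("x"));
    }
}
```

The output is always 64 lowercase hex characters, which covers the length and special-character points in the list above.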





[jira] [Created] (HIVE-18602) Implement Partition By Hash Index

2018-02-01 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-18602:
--

 Summary: Implement Partition By Hash Index
 Key: HIVE-18602
 URL: https://issues.apache.org/jira/browse/HIVE-18602
 Project: Hive
  Issue Type: New Feature
Reporter: BELUGA BEHR


Borrowing the concept from MySQL.  This would also save us from having random 
column values in the HDFS partition file path since the HASH value would be hex 
and each one would be the same length.

 

https://dev.mysql.com/doc/refman/5.7/en/partitioning-hash.html





IMetaStoreClient.getPartitionsByNames is not case insensitive on names

2018-02-01 Thread Adam Szita
Hi all,

While testing the HMS API, we've found that the col name part of partition
names provided to the IMetaStoreClient.getPartitionsByNames method is not
handled in a case-insensitive manner.

See related test here:
https://github.com/apache/hive/blob/master/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java#L357

Although col names are mostly handled in a case-insensitive way throughout
Hive, I'm not sure whether this case should be corrected. Gut feeling
says this should be insensitive too, but I'm afraid changing it would
introduce more computation than it's worth (splitting the partition name,
lowercasing all col parts and then reassembling the whole string before
storing it in the HMS DB...).
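
To make the cost concrete, the normalization described above could look roughly like the sketch below. It is an illustration, not HMS code: the class and method names are made up, and it assumes that '/' and '=' appear in the name only as separators (i.e. values are already escaped):

```java
import java.util.Locale;
import java.util.StringJoiner;

public class PartitionNameNormalizer {

    // Lowercases only the column-name part of each "col=value" segment,
    // leaving the values untouched, e.g. "YYYY=2017/MM=Jan" -> "yyyy=2017/mm=Jan".
    static String lowerCaseColumnNames(String partName) {
        StringJoiner out = new StringJoiner("/");
        for (String part : partName.split("/")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                // Not a col=value pair; pass it through unchanged.
                out.add(part);
            } else {
                out.add(part.substring(0, eq).toLowerCase(Locale.ROOT) + part.substring(eq));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseColumnNames("YYYY=2017/MM=Jan"));
    }
}
```

This is one pass over the string per partition name, so the extra work grows linearly with the number and length of the names passed in.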

What's your opinion on this?

Thanks,
Adam


Re: Build bot comments in Jira

2018-02-01 Thread Adam Szita
Hi,

I can recommend using this Chrome extension:
https://github.com/gezapeti/jira-comment-collapser. It was developed by a
colleague of ours who's working on Oozie.
It can also be found in the store:
https://chrome.google.com/webstore/search/jira%20comment%20collapser%20gezapeti?hl=en-GB_source=chrome-ntp-launcher

We just have to extend its user list with "hiveqa".

Prasanth: I'm not sure what you mean by Yetus deduplicating patch
submissions. AFAIK Yetus integration introduced no change relating to the
triggering of these jobs.

Thanks,
Adam

On 31 January 2018 at 21:32, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

> Also Yetus doesn't look like it's deduplicating patch submissions. Earlier
> only the latest patch submission would run ptest but now all submissions are
> running ptest even if the jira is closed.
>
> Thanks
> Prasanth
>
>
>
> On Wed, Jan 31, 2018 at 12:09 PM -0800, "Vineet Garg" <
> vg...@hortonworks.com> wrote:
>
>
> I would like to know this too. With checkstyle/yetus comments it has
> become very noisy.
>
> > On Jan 31, 2018, at 10:04 AM, Eugene Koifman  wrote:
> >
> > Hi,
> > Is there a way to automatically collapse the comments from build bot in
> the Jiras?  Or some config change to make that happen?
> > Maybe they can be relegated to a separate tab.
> >
> > For tickets that go through multiple patches, these comments obscure the
> discussion comments.
> >
> >
> > Thanks,
> > Eugene
>
>
>


[jira] [Created] (HIVE-18601) Support Power platform by updating protoc-jar-maven-plugin version

2018-02-01 Thread Pravin Dsilva (JIRA)
Pravin Dsilva created HIVE-18601:


 Summary: Support Power platform by updating 
protoc-jar-maven-plugin version
 Key: HIVE-18601
 URL: https://issues.apache.org/jira/browse/HIVE-18601
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 3.0.0
 Environment: # uname -a
Linux pts00607-vm16 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:05:18 UTC 
2016 ppc64le ppc64le ppc64le GNU/Linux
# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
Reporter: Pravin Dsilva


The error below is seen while building the standalone-metastore project:
{code:java}
[INFO] --- protoc-jar-maven-plugin:3.0.0-a3:run 
(default) @ hive-standalone-metastore ---
[INFO] Protoc version: 2.5.0
[INFO] Input directories:
[INFO] 
/var/lib/jenkins/workspace/hive/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore
[INFO] Output targets:
[INFO] java: 
/var/lib/jenkins/workspace/hive/standalone-metastore/target/generated-sources 
(add: none, clean: false)
[INFO] 
/var/lib/jenkins/workspace/hive/standalone-metastore/target/generated-sources 
does not exist. Creating...
[INFO] Processing (java): metastore.proto
protoc-jar: protoc version: 250, detected platform: linux/ppc64le
protoc-jar: executing: [/tmp/protoc184130581004216.exe, 
-I/var/lib/jenkins/workspace/hive/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/var/lib/jenkins/workspace/hive/standalone-metastore/target/generated-sources,
 
/var/lib/jenkins/workspace/hive/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
/tmp/protoc184130581004216.exe: 1: /tmp/protoc184130581004216.exe: 
ELF: not found
/tmp/protoc184130581004216.exe: 1: /tmp/protoc184130581004216.exe: Cȁ: 
not found
/tmp/protoc184130581004216.exe: 2: /tmp/protoc184130581004216.exe: �: 
not found
/tmp/protoc184130581004216.exe: 3: /tmp/protoc184130581004216.exe: 
��_�c���jnP���R���?��Y@�9��Ch��߳yIk��: not found
/tmp/protoc184130581004216.exe: 2: /tmp/protoc184130581004216.exe: 
Syntax error: Unterminated quoted string{code}
 

The protoc-jar-maven-plugin version used is 3.0.0-a3 whereas Power (ppc64le) 
support was added in 3.5.1.1. 





[jira] [Created] (HIVE-18600) Vectorization: Top-Level Vector Expression Scratch Column Deallocation

2018-02-01 Thread Matt McCline (JIRA)
Matt McCline created HIVE-18600:
---

 Summary: Vectorization: Top-Level Vector Expression Scratch Column 
Deallocation
 Key: HIVE-18600
 URL: https://issues.apache.org/jira/browse/HIVE-18600
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
 Fix For: 3.0.0


The operators create various vector expression *arrays* for predicates, SELECT 
clauses, key expressions, etc.  We could mark those as special "top level" 
vector expressions and then defer deallocation until the top-level expression 
is complete.  This could be a simple solution that avoids trying to fix our 
current eager deallocation, which tries to reuse scratch columns as soon as 
possible.  It *isn't optimal*, but it *shouldn't be too bad*. This solution is 
much better than not deallocating at all - especially for queries that SELECT a 
large number of columns or have a lot of expressions in the operator tree.
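
A rough sketch of the deferred scheme, to make the proposal concrete. This is not the Hive vectorizer code: ScratchColumnPool and its methods are hypothetical names, and the assumption that the top-level expression's own result column stays live is mine:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ScratchColumnPool {
    private final int firstScratchColumn;
    private int nextNewColumn;
    // Columns released by earlier top-level expressions, available for reuse.
    private final Deque<Integer> free = new ArrayDeque<>();
    // Columns handed out while the current top-level expression is being built.
    private final List<Integer> allocatedInCurrent = new ArrayList<>();

    ScratchColumnPool(int firstScratchColumn) {
        this.firstScratchColumn = firstScratchColumn;
        this.nextNewColumn = firstScratchColumn;
    }

    // Hands out a scratch column. Nothing is reused eagerly inside the
    // current top-level expression; only previously released columns are.
    int allocate() {
        Integer reused = free.pollFirst();
        int col = (reused != null) ? reused : nextNewColumn++;
        allocatedInCurrent.add(col);
        return col;
    }

    // Called when a top-level expression is complete: releases every
    // intermediate column it used, except the column holding its result.
    void finishTopLevelExpression(int resultColumn) {
        for (int col : allocatedInCurrent) {
            if (col != resultColumn) {
                free.addLast(col);
            }
        }
        allocatedInCurrent.clear();
    }

    int totalColumnsCreated() {
        return nextNewColumn - firstScratchColumn;
    }

    public static void main(String[] args) {
        ScratchColumnPool pool = new ScratchColumnPool(3);
        pool.allocate();
        pool.allocate();
        int result = pool.allocate();
        pool.finishTopLevelExpression(result);
        System.out.println("columns created: " + pool.totalColumnsCreated());
    }
}
```

Within one top-level expression this uses more scratch columns than eager reuse would, but intermediates are shared across the expressions in the array, so the total stays bounded by the largest single expression plus the result columns.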





[GitHub] hive pull request #304: HIVE-18581: Replication events should use lower case...

2018-02-01 Thread anishek
GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/304

HIVE-18581: Replication events should use lower case db object names



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-18581

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/304.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #304


commit 553f0de490d49b77294a70875264819b387e2a45
Author: Anishek Agarwal 
Date:   2018-01-31T10:12:31Z

HIVE-18581: Replication events should use lower case db object names




---