[jira] [Created] (HIVE-21172) DEFAULT keyword handling in MERGE UPDATE clause issues

2019-01-25 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-21172:
-

 Summary: DEFAULT keyword handling in MERGE UPDATE clause issues
 Key: HIVE-21172
 URL: https://issues.apache.org/jira/browse/HIVE-21172
 Project: Hive
  Issue Type: Sub-task
  Components: SQL, Transactions
Affects Versions: 4.0.0
Reporter: Eugene Koifman


Once HIVE-21159 lands, enable {{HiveConf.MERGE_SPLIT_UPDATE}} and run these 
tests:

TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=insert_into_default_keyword.q

Merge is rewritten as a multi-insert.  When the UPDATE clause has DEFAULT, it's 
not properly replaced with a value in the multi-insert - it's treated as a literal:
{noformat}
INSERT INTO `default`.`acidTable`-- update clause(insert part)
 SELECT `t`.`key`, `DEFAULT`, `t`.`value`
   WHERE `t`.`key` = `s`.`key` AND `s`.`key` > 3 AND NOT(`s`.`key` < 3)
{noformat}

See {{LOG.info("Going to reparse <" + originalQuery + "> as \n<" + 
rewrittenQueryStr.toString() + ">");}} in hive.log

{{MergeSemanticAnalyzer.replaceDefaultKeywordForMerge()}} is only called from 
{{handleInsert}} but not {{handleUpdate()}}.  Why does the issue only show up 
with {{MERGE_SPLIT_UPDATE}}?
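The fix {{handleUpdate()}} needs can be illustrated with a small sketch (Python for brevity; the function and variable names here are hypothetical, not Hive's actual API): each DEFAULT in the rewritten SELECT list should be substituted with the target column's configured default before the rewritten query is reparsed.

```python
# Hypothetical sketch, not Hive's actual API: replace a bare DEFAULT in the
# rewritten multi-insert SELECT list with the target column's configured
# default - what handleUpdate() should do, as handleInsert already does.
def replace_defaults(select_list, target_cols, column_defaults):
    out = []
    for expr, col in zip(select_list, target_cols):
        if expr.strip('`').upper() == 'DEFAULT':
            # fall back to NULL when the column has no configured default
            expr = column_defaults.get(col, 'NULL')
        out.append(expr)
    return ', '.join(out)

print(replace_defaults(['`t`.`key`', '`DEFAULT`', '`t`.`value`'],
                       ['key', 'mid', 'value'],
                       {'mid': '42'}))
# -> `t`.`key`, 42, `t`.`value`
```

Without this substitution, the rewritten query carries `` `DEFAULT` `` through as a column reference, which is the literal seen in the multi-insert above.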




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-25 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 26, 2019, 12:32 a.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b3a475478d 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e7aa041c25 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
db3b427adc 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out c9716e904c 


Diff: https://reviews.apache.org/r/69367/diff/8/

Changes: https://reviews.apache.org/r/69367/diff/7-8/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-25 Thread Vaibhav Gumashta


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 196 (patched)
> > 
> >
> > I don't understand the logic here.  Since major compaction was done 
> > above, there should only be base/bucket0 and base/bucket1 so there is 
> > nothing for this query to group.  Also, I would think SPLIT_GROUPING_MODE 
> > should be "query" here...  if it's not, where is it set?

Sorry, the comment is misleading (what I wanted to convey was what you wrote 
above) - removed. We set SPLIT_GROUPING_MODE inside CompactorMR if 
COMPACTOR_CRUD_QUERY_BASED is set to true. I've modified the runCompaction, 
runInitiator, and runWorker methods to create a new HiveConf object and set 
COMPACTOR_CRUD_QUERY_BASED to true in that one.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 255 (patched)
> > 
> >
> > nit: this could just do ShowCompactions to see if anything got queued up

I think you are right; this is redundant, as I'm already checking the compaction 
queue above.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 309 (patched)
> > 
> >
> > How does (3,3,x) end up in bucket0?  with bucketing_version=1 it should 
> > be (val mod num_buckets)=bucketId.

Filed https://issues.apache.org/jira/browse/HIVE-21167. For the test cases in 
this jira, will use bucketing_version=2.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 312 (patched)
> > 
> >
> > And similarly, (4,4) is in bucket1...

Filed https://issues.apache.org/jira/browse/HIVE-21167. For the test cases in 
this jira, will use bucketing_version=2.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 315 (patched)
> > 
> >
> > since you just ran a major compaction, there is only 1 file per bucket 
> > so would split grouper do anything?  would there be > 1 split?

Sorry, my comment was misleading - removed.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
> > Lines 189 (patched)
> > 
> >
> > What is this for?  It seems fragile since it forces some behavior on 
> > all tests.  Do any newly added tests rely on this?

This is a bug that I'm masking for the purposes of the test (similar to the bug 
reported here: https://issues.apache.org/jira/browse/HIVE-12957). Basically, on 
my laptop Tez was not estimating taskResource (taskResource = 
getContext().getVertexTaskResource().getMemory();) correctly - it came up with a 
negative value, which would cause it to throw an Illegal Capacity exception. I 
think taskResource should never return a negative value.
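A defensive guard of the kind suggested (never letting a negative task-resource estimate through) could look like the following sketch; the function name and fallback value are illustrative, not Tez's actual API:

```python
def safe_task_resource_mb(reported_mb, fallback_mb=1024):
    """Return the reported task memory, or a fallback when the estimate is
    non-positive - a negative value would otherwise surface later as an
    Illegal Capacity exception when sizing capacity from it."""
    if reported_mb <= 0:
        return fallback_mb
    return reported_mb

print(safe_task_resource_mb(-512))   # -> 1024
print(safe_task_resource_mb(4096))   # -> 4096
```

The real fix belongs in the resource estimation itself; a clamp like this only masks the symptom, which is why it is questionable to force it on all tests.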


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java
> > Line 1233 (original)
> > 
> >
> > Seems that now the class level JavaDoc is out of sync

You mean the overall class doc for OrcRawRecordMerger needs an update?


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java
> > Lines 637 (patched)
> > 
> >
> > What throws the IAE?  Above I see
> > if (!reader.hasMetadataValue(OrcRecordUpdater.ACID_KEY_INDEX_NAME)) {
> > 
> > shouldn't it bail out there if there is no index?

I don't know why I was seeing this earlier - don't see it in my runs now. 
Removed.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java
> > Lines 638 (patched)
> > 
> >
> > is there a followup Jira for this?

https://jira.apache.org/jira/browse/HIVE-21165


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 248 (patched)
> > 
> >
> > What does 

[jira] [Created] (HIVE-21171) Skip creating scratch dirs for tez if RPC is on

2019-01-25 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-21171:
--

 Summary: Skip creating scratch dirs for tez if RPC is on
 Key: HIVE-21171
 URL: https://issues.apache.org/jira/browse/HIVE-21171
 Project: Hive
  Issue Type: Improvement
  Components: Tez
Reporter: Vineet Garg
Assignee: Vineet Garg


There are a few places, e.g. during DAG/vertex creation, where scratch 
directories are created for each vertex even if the plan is being sent over RPC. 
This adds unnecessary overhead for cloud file systems, e.g. S3A.





[jira] [Created] (HIVE-21170) Wrong (no) results of cross-product query executed on LLAP

2019-01-25 Thread Krzysztof Zarzycki (JIRA)
Krzysztof Zarzycki created HIVE-21170:
-

 Summary: Wrong (no) results of cross-product query executed on 
LLAP 
 Key: HIVE-21170
 URL: https://issues.apache.org/jira/browse/HIVE-21170
 Project: Hive
  Issue Type: Bug
 Environment: Hive distribution: HDP 3.1.0. 

LLAP execution engine in version:
{code:java}
$ beeline --version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 3.1.0.3.1.0.0-78
Git 
git://ctr-e138-1518143905142-586755-01-15.hwx.site/grid/0/jenkins/workspace/HDP-parallel-centos7/SOURCES/hive
 -r 56673b027117d8cb3400675b1680a4d992360808
Compiled by jenkins on Thu Dec 6 12:27:21 UTC 2018
From source with checksum 97cc61f6acbe68b1fa988aa9f76b34cc{code}
 
Reporter: Krzysztof Zarzycki



*On LLAP execution engine*, the following query gives *wrong results*:
{code}
-- prepare test data
set hive.query.results.cache.enabled=false;
create table test1 (id int);
insert into test1 values (1),(2),(3);

-- query
select * from test1 t1 cross join test1 t2;
{code}

*Query result:*
{code}
0: jdbc:hive2://hostname:> select t1.* from test1 t1 cross join test1 t2;
INFO : Compiling 
command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6): 
select t1.* from test1 t1 cross join test1 t2
INFO : Warning: Shuffle Join MERGEJOIN[9][tables = [$hdt$_0, $hdt$_1]] in Stage 
'Reducer 2' is a cross product
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:t1.id, 
type:int, comment:null)], properties:null)
INFO : Completed compiling 
command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6); Time 
taken: 0.229 seconds
INFO : Executing 
command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6): 
select t1.* from test1 t1 cross join test1 t2
INFO : Query ID = hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in parallel
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Map 3 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Reducer 2             llap     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 02/03  [==>>] 100%  ELAPSED TIME: 1.92 s
----------------------------------------------------------------------------------------------
INFO : Completed executing 
command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6); Time 
taken: 2.006 seconds
INFO : OK
++
| t1.id |
++
++
No rows selected (2.284 seconds)
{code}

*Expected result:*
{code}
++
| t1.id |
++
| 3 |
| 3 |
| 3 |
| 2 |
| 2 |
| 2 |
| 1 |
| 1 |
| 1 |
++
9 rows selected
{code}

*What worked as a workaround:*
{{set hive.tez.cartesian-product.enabled=false;}} (default: true)
Then the query gave a correct result.
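As a sanity check on the expected result, a cross join of a 3-row table with itself must produce 3 × 3 = 9 rows, each id appearing three times; a quick sketch (Python, standing in for the SQL semantics) confirms the cardinality:

```python
from itertools import product

# test1 holds the three ids inserted in the repro
test1 = [1, 2, 3]

# CROSS JOIN semantics: every row of t1 paired with every row of t2
rows = [(t1, t2) for t1, t2 in product(test1, test1)]

print(len(rows))                      # -> 9
print(sorted(t1 for t1, _ in rows))   # -> [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

The empty result set above is therefore unambiguously wrong, regardless of which columns are projected.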

*Difference in execution plans:*
1. With {{set hive.tez.cartesian-product.enabled=true;}}: 
{code}
++
| Explain |
++
| Plan optimized by CBO. |
| |
| Vertex dependency in root stage |
| Reducer 2 <- Map 1 (XPROD_EDGE), Map 3 (XPROD_EDGE) |
| |
| Stage-0 |
| Fetch Operator |
| limit:-1 |
| Stage-1 |
| Reducer 2 llap |
| File Output Operator [FS_8] |
| Merge Join Operator [MERGEJOIN_9] (rows=9 

[jira] [Created] (HIVE-21169) Add StatementExecuteTime to bone cp metrics

2019-01-25 Thread Karthik Manamcheri (JIRA)
Karthik Manamcheri created HIVE-21169:
-

 Summary: Add StatementExecuteTime to bone cp metrics
 Key: HIVE-21169
 URL: https://issues.apache.org/jira/browse/HIVE-21169
 Project: Hive
  Issue Type: Improvement
  Components: Hive, Standalone Metastore
Reporter: Karthik Manamcheri


HIVE-21045 added connection pool metrics for BoneCP and HikariCP. BoneCP also 
has a metric called "StatementExecuteTime". [~ngangam] suggested that we see 
if there is value in exposing "StatementExecuteTimeAvg" as a metric. This could 
be a good indicator of whether the HMS DB is performant.





Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Morio Ramdenbourg via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/
---

(Updated Jan. 25, 2019, 7:59 p.m.)


Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
Karajgaonkar.


Changes
---

Made a few phrasing changes based on Vihang's feedback


Bugs: HIVE-21083
https://issues.apache.org/jira/browse/HIVE-21083


Repository: hive-git


Description (updated)
---

It was identified that a valid way of configuring TLS is by using the Java 
default truststore and directly adding the trusted certificates to it. The 
previous HMS implementation did not support this.

Modified the TLS properties in the following ways:
- Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
user does not specify a custom one, it will default to the Java truststore.
- Removed the logs/warnings on metastore.dbaccess.ssl.truststore.password. 
This used to generate a lot of noise if the user did not provide one. Also, the 
contents of the truststore are certificates, which are public information and 
don't require strict security.
- Removed the unit test that checks for an empty truststore path.
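For context, when no truststore is configured JSSE falls back to $JAVA_HOME/lib/security/jssecacerts if it exists, otherwise cacerts; a sketch of that resolution order (Python for illustration; the directory layout shown is the standard JRE one):

```python
import os

def default_truststore(java_home):
    """Mimic JSSE's default truststore lookup: jssecacerts wins if present,
    otherwise cacerts is used."""
    security = os.path.join(java_home, 'lib', 'security')
    jssecacerts = os.path.join(security, 'jssecacerts')
    if os.path.isfile(jssecacerts):
        return jssecacerts
    return os.path.join(security, 'cacerts')
```

This is why dropping the truststore-path requirement is safe: the JVM already has a well-defined default to fall back on.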


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 75f0c0a356f3b894408aa54b9cce5220d47d7f26 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 29738ba19b0d5ed9ec224d2288c0c1c922d0674c 


Diff: https://reviews.apache.org/r/69834/diff/4/

Changes: https://reviews.apache.org/r/69834/diff/3-4/


Testing
---

- Existing unit test coverage
- Manual testing by verifying that these properties can configure TLS to a 
MySQL DB


Thanks,

Morio Ramdenbourg



Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Morio Ramdenbourg via Review Board


> On Jan. 25, 2019, 7:41 p.m., Vihang Karajgaonkar wrote:
> > Some minor comments related to logs. Rest looks good.

Thanks for the feedback


> On Jan. 25, 2019, 7:41 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
> > Lines 472 (patched)
> > 
> >
> > Nit: I think it would be useful to say "Defaults to jssecacerts, if it 
> > exists, otherwise uses cacerts"

Done


> On Jan. 25, 2019, 7:41 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 362 (patched)
> > 
> >
> > Maybe it's useful to specify what the default is. So, a message like ".. 
> > has not been set. Defaulting to jssecacerts, if it exists. Otherwise, 
> > cacerts."

Done


> On Jan. 25, 2019, 7:41 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 369 (original), 371 (patched)
> > 
> >
> > nit: instead of "defaulting to default", maybe just say "Using default 
> > Java truststore password."

Done


- Morio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/#review212345
---


On Jan. 25, 2019, 7:22 p.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69834/
> ---
> 
> (Updated Jan. 25, 2019, 7:22 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-21083
> https://issues.apache.org/jira/browse/HIVE-21083
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It was identified that a valid way of configuring TLS is by using the Java 
> default truststore and directly adding the trusted certificates to it. The 
> previous HMS implementation did not support this.
> 
> Modified the TLS properties in the following ways:
>  - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
> user does not specify a custom one, then it will default to the Java 
> truststore.
>  - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
> This used to generate a lot of noise if the user did not provide one. Also, 
> the contents of the truststore is certificates, which is public information 
> and doesn't require strict security.
>  - Removed the unit test that checks for an empty truststore path.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  75f0c0a356f3b894408aa54b9cce5220d47d7f26 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  29738ba19b0d5ed9ec224d2288c0c1c922d0674c 
> 
> 
> Diff: https://reviews.apache.org/r/69834/diff/3/
> 
> 
> Testing
> ---
> 
> - Existing unit test coverage
> - Manual testing by verifying that these properties can configure TLS to a 
> MySQL DB
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



[jira] [Created] (HIVE-21168) Fix TestSchemaToolCatalogOps

2019-01-25 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21168:
--

 Summary: Fix TestSchemaToolCatalogOps
 Key: HIVE-21168
 URL: https://issues.apache.org/jira/browse/HIVE-21168
 Project: Hive
  Issue Type: Test
Affects Versions: 3.2.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-21077 causes TestSchemaToolCatalogOps to fail on branch-3





Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/#review212345
---


Fix it, then Ship it!




Some minor comments related to logs. Rest looks good.


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 472 (patched)


Nit: I think it would be useful to say "Defaults to jssecacerts, if it 
exists, otherwise uses cacerts"



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 362 (patched)


Maybe it's useful to specify what the default is. So, a message like ".. has 
not been set. Defaulting to jssecacerts, if it exists. Otherwise, cacerts."



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 369 (original), 371 (patched)


nit: instead of "defaulting to default", maybe just say "Using default Java 
truststore password."


- Vihang Karajgaonkar


On Jan. 25, 2019, 7:22 p.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69834/
> ---
> 
> (Updated Jan. 25, 2019, 7:22 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-21083
> https://issues.apache.org/jira/browse/HIVE-21083
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It was identified that a valid way of configuring TLS is by using the Java 
> default truststore and directly adding the trusted certificates to it. The 
> previous HMS implementation did not support this.
> 
> Modified the TLS properties in the following ways:
>  - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
> user does not specify a custom one, then it will default to the Java 
> truststore.
>  - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
> This used to generate a lot of noise if the user did not provide one. Also, 
> the contents of the truststore is certificates, which is public information 
> and doesn't require strict security.
>  - Removed the unit test that checks for an empty truststore path.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  75f0c0a356f3b894408aa54b9cce5220d47d7f26 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  29738ba19b0d5ed9ec224d2288c0c1c922d0674c 
> 
> 
> Diff: https://reviews.apache.org/r/69834/diff/3/
> 
> 
> Testing
> ---
> 
> - Existing unit test coverage
> - Manual testing by verifying that these properties can configure TLS to a 
> MySQL DB
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Morio Ramdenbourg via Review Board


> On Jan. 25, 2019, 5:08 p.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
> > Line 1082 (original)
> > 
> >
> > Add a test to make sure that we don't throw an exception if the 
> > truststore path and password are empty.

Done


- Morio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/#review212335
---


On Jan. 25, 2019, 1:38 a.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69834/
> ---
> 
> (Updated Jan. 25, 2019, 1:38 a.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-21083
> https://issues.apache.org/jira/browse/HIVE-21083
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It was identified that a valid way of configuring TLS is by using the Java 
> default truststore and directly adding the trusted certificates to it. The 
> previous HMS implementation did not support this.
>   
> Modified the TLS properties in the following ways:
> - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
> user does not specify a custom one, then it will default to the Java 
> truststore.
> - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
> This used to generate a lot of noise if the user did not provide one. Also, 
> the contents of the truststore is certificates, which is public information 
> and doesn't require strict security.
> - Removed the unit test that checks for an empty truststore path.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  75f0c0a356f3b894408aa54b9cce5220d47d7f26 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  29738ba19b0d5ed9ec224d2288c0c1c922d0674c 
> 
> 
> Diff: https://reviews.apache.org/r/69834/diff/2/
> 
> 
> Testing
> ---
> 
> - Existing unit test coverage
> - Manual testing by verifying that these properties can configure TLS to a 
> MySQL DB
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Morio Ramdenbourg via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/
---

(Updated Jan. 25, 2019, 7:22 p.m.)


Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
Karajgaonkar.


Changes
---

Based on Karthik's feedback, added a unit test to ensure that an empty 
truststore path/password does not throw an exception, and improved the comments.


Bugs: HIVE-21083
https://issues.apache.org/jira/browse/HIVE-21083


Repository: hive-git


Description (updated)
---

It was identified that a valid way of configuring TLS is by using the Java 
default truststore and directly adding the trusted certificates to it. The 
previous HMS implementation did not support this.

Modified the TLS properties in the following ways:
 - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
user does not specify a custom one, then it will default to the Java truststore.
 - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
This used to generate a lot of noise if the user did not provide one. Also, the 
contents of the truststore is certificates, which is public information and 
doesn't require strict security.
 - Removed the unit test that checks for an empty truststore path.


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 75f0c0a356f3b894408aa54b9cce5220d47d7f26 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 29738ba19b0d5ed9ec224d2288c0c1c922d0674c 


Diff: https://reviews.apache.org/r/69834/diff/3/

Changes: https://reviews.apache.org/r/69834/diff/2-3/


Testing
---

- Existing unit test coverage
- Manual testing by verifying that these properties can configure TLS to a 
MySQL DB


Thanks,

Morio Ramdenbourg



[jira] [Created] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-01-25 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21167:
---

 Summary: Bucketing: Bucketing version 1 is incorrectly 
partitioning data
 Key: HIVE-21167
 URL: https://issues.apache.org/jira/browse/HIVE-21167
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


Using a murmur hash for bucketing columns was introduced in HIVE-18910, following 
which {{'bucketing_version'='1'}} stands for the old behaviour (where, for 
example, integer columns were partitioned based on mod values). Looks like we 
now have a bug in the old bucketing scheme. I could repro it by modifying the 
existing schema with an ALTER TABLE ADD COLUMNS and adding new data. Repro:

{code}
0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 (a 
int, b int) partitioned by(ds string) clustered by (a) into 2 buckets stored as 
ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
'transactional_properties'='default');

No rows affected (0.418 seconds)

0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
6 rows affected (3.695 seconds)
{code}

Data from ORC file (data as expected):
{code}
/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
{"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
"currentTransaction": 1, "row": {"a": 2, "b": 4}}
{"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
"currentTransaction": 1, "row": {"a": 2, "b": 3}}


/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
{"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
"currentTransaction": 1, "row": {"a": 1, "b": 3}}
{"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
"currentTransaction": 1, "row": {"a": 1, "b": 2}}
{code}

Modifying table schema and inserting new data:
{code}
0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
int);

No rows affected (0.541 seconds)

0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
 (4,3,1004,'yesterday'),(4,4,1005,'today');
6 rows affected (3.699 seconds)
{code}

Data from ORC file (wrong partitioning):
{code}
/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
{"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}

/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
"currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
{code}

As seen above, the expected behaviour is that new data with column 'a' being 3 
should go to bucket1 and column 'a' being 4 should go to bucket0, but the 
partitioning is wrong.
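Under the old scheme ({{'bucketing_version'='1'}}), an integer bucketing column is expected to land in bucket (value mod num_buckets); a quick sketch (Python, just restating that rule) of the expected placement for column 'a' above:

```python
def v1_bucket(value, num_buckets=2):
    """Old (bucketing_version=1) placement for an int column: value mod buckets."""
    return value % num_buckets

# Expected placement for column 'a' in the repro (2 buckets):
for a in (1, 2, 3, 4):
    print(a, '->', 'bucket_%d' % v1_bucket(a))
# 1 -> bucket_1, 2 -> bucket_0, 3 -> bucket_1, 4 -> bucket_0
```

The first insert matches this (a=1 in bucket_1, a=2 in bucket_0); the post-ALTER insert does not (a=3 lands in bucket_0, a=4 in bucket_1), which is the bug.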







Re: Review Request 69642: HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-25 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69642/#review212343
---


Ship it!




Ship It!

- Peter Vary


On Jan. 3, 2019, 1:40 a.m., Karthik Manamcheri wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69642/
> ---
> 
> (Updated Jan. 3, 2019, 1:40 a.m.)
> 
> 
> Review request for hive, Adam Holley, Na Li, Morio Ramdenbourg, Naveen 
> Gangam, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve 
> get_partition performance
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e7 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/PreReadTableEvent.java
>  beec72bc12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/ThrowingSupplier.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  7429d18226 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
>  fe64a91b56 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
>  4d7f7c1220 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  a338bd4032 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/events/TestPreReadTableEvent.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69642/diff/4/
> 
> 
> Testing
> ---
> 
> Unit tests.
> Manual performance test with Cloudera BDR to notice improved backup 
> performance.
> 
> 
> Thanks,
> 
> Karthik Manamcheri
> 
>
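The lazy evaluation HIVE-20977 applies to the table object can be sketched roughly as follows (a Python stand-in for the Java ThrowingSupplier; the names are hypothetical): the expensive table fetch runs only if an event listener actually asks for the table, and at most once.

```python
class Lazy:
    """Memoizing supplier: evaluate on first get(), cache thereafter."""
    def __init__(self, supplier):
        self._supplier = supplier
        self._value = None
        self._evaluated = False

    def get(self):
        if not self._evaluated:
            self._value = self._supplier()
            self._evaluated = True
        return self._value

calls = []
def fetch_table():
    calls.append(1)          # stands in for an expensive metastore RDBMS round trip
    return {"name": "t1"}

table = Lazy(fetch_table)
assert calls == []           # constructing the event fetched nothing
assert table.get()["name"] == "t1"
table.get()
assert len(calls) == 1       # fetched exactly once, then cached
```

A get_partition call whose listeners never invoke get() therefore skips the table lookup entirely, which is where the reported performance win comes from.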



Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-25 Thread Karthik Manamcheri via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review212336
---


Ship it!




Ship It!

- Karthik Manamcheri


On Jan. 15, 2019, midnight, Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 15, 2019, midnight)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database and Catalogs should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
>  3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  13e287e352bdbfe5263b058e1b430af8613fe815 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  8f149d1d6e2a5b9571eeef3c05d68834e4035172 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  9e5f0860f2b0e8caa9abf213e2a2c91b8e16d985 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 9576f8775a4a8a314e09462cbaaaeaebd3b4921f 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
>  e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  edde08db9ef7ee01800c7cc3a04c813014abdd18 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  701acb00984c61f7511dcc48053890b154575d1f 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  b1980c5b83f16614845063516495188ebdd8c2a3 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  b9f63313251ab1fa6278b862ed9e07e62b234c04 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
>  9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
>  0c36069d071d4b60cc338ba729da5d22e08ca8ca 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  bb20d9f42a855100397140f9e018c04c5f61dde7 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
>  28eb1fadca80dfd3c962e4163120b83f00410c4a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
>  d323ac6c90ed20f092b4e179fdb1bed8602ecf63 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/TestSchemaToolForMetastore.java
>  c2eb6c9e22a22f09cc1d2cc6394aa4e0e339b63a 
> 
> 
> Diff: https://reviews.apache.org/r/69664/diff/7/
> 
> 
> 

Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Karthik Manamcheri via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/#review212335
---




standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Line 1082 (original)


Add a test to make sure that we don't throw an exception if the truststore 
path and password are empty.


- Karthik Manamcheri


On Jan. 25, 2019, 1:38 a.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69834/
> ---
> 
> (Updated Jan. 25, 2019, 1:38 a.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-21083
> https://issues.apache.org/jira/browse/HIVE-21083
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It was identified that a valid way of configuring TLS is by using the Java 
> default truststore and directly adding the trusted certificates to it. The 
> previous HMS implementation did not support this.
>   
> Modified the TLS properties in the following ways:
> - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
> user does not specify a custom one, then it will default to the Java 
> truststore.
> - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
> This used to generate a lot of noise if the user did not provide one. Also, 
> the contents of the truststore are certificates, which are public information 
> and don't require strict security.
> - Removed the unit test that checks for an empty truststore path.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  75f0c0a356f3b894408aa54b9cce5220d47d7f26 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  29738ba19b0d5ed9ec224d2288c0c1c922d0674c 
> 
> 
> Diff: https://reviews.apache.org/r/69834/diff/2/
> 
> 
> Testing
> ---
> 
> - Existing unit test coverage
> - Manual testing by verifying that these properties can configure TLS to a 
> MySQL DB
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>
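The truststore fallback HIVE-21083 describes for the Java metastore can be illustrated with a rough Python analogue (the function name is hypothetical): when no custom truststore path is configured, fall back to the platform's default trust anchors instead of raising an error.

```python
import ssl

def make_ssl_context(truststore_path=None, password=None):
    """Build a TLS context; missing path/password is not an error."""
    if truststore_path:
        # Custom truststore supplied: trust only its certificates.
        ctx = ssl.create_default_context(cafile=truststore_path)
    else:
        # Java analogue: fall back to the JRE default truststore
        # ($JAVA_HOME/lib/security/cacerts); here, the system CAs.
        ctx = ssl.create_default_context()
    return ctx

ctx = make_ssl_context()              # no path, no password: no exception
assert ctx.verify_mode == ssl.CERT_REQUIRED
```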



Re: [DISCUSS] Consistent Timestamps across Hadoop

2019-01-25 Thread Zoltan Ivanfi
Dear Hive Developers,

I would like to briefly summarize an offline discussion a few of us had
about the timestamp harmonization proposal. Please
let me know if you agree with the outcome or if you have any concerns or
questions.

[Meeting notes]

Participants: Owen O'Malley and Jesús Camacho Rodríguez from Hive, Anna
Szonyi and Zoltan Ivanfi representing the original proposal.

Owen and Jesús reasoned that the TIMESTAMP type must have the same
semantics in all file formats in Hive.

Anna and Zoltan reasoned that different Hive versions (and other components
as well) must be able to correctly read timestamps written by each other,
but there is a historical practice of normalizing to UTC in selected file
formats and eliminating that practice would be a breaking change.

Owen and Jesús suggested a solution that would change the semantics without
eliminating the practice of normalizing to UTC. This makes this solution
completely backwards- and forwards-compatible. The solution involves
recording the session-local local time zone in the file metadata fields
that allow arbitrary key-value storage. When reading back files with this
time zone metadata, newer Hive versions (or any other new component aware
of this extra metadata) can achieve LocalDateTime semantics by converting
from UTC to the saved time zone (instead of to the local time zone). Legacy
components that are unaware of the new metadata can read the files without
any problem and the timestamps will show the historical Instant behaviour
to them.

[End of meeting notes]
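The read path from the meeting notes can be sketched as follows, with a hypothetical metadata key name (the proposal only requires some free-form key/value slot in the file metadata):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Writer side: normalize the session-local wall-clock time to UTC,
# as the affected file formats historically do, and record the
# session zone in the file metadata.
session_zone = ZoneInfo("Europe/Budapest")
local_wall_clock = datetime(2019, 1, 25, 12, 0, 0)
stored_utc = local_wall_clock.replace(tzinfo=session_zone).astimezone(timezone.utc)
file_metadata = {"writer.time.zone": "Europe/Budapest"}  # hypothetical key

# New reader: convert the UTC instant back to the *saved* zone,
# recovering LocalDateTime semantics.
saved = ZoneInfo(file_metadata["writer.time.zone"])
recovered = stored_utc.astimezone(saved).replace(tzinfo=None)
assert recovered == local_wall_clock

# A legacy reader, unaware of the metadata, converts to its own local
# zone instead -- the historical Instant behaviour, unchanged.
```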

Since this solution achieves both goals, I have updated the proposal
accordingly. Hopefully with this change we have
resolved any remaining disagreements. I will wait a little for feedback and
if everybody is fine with the updated proposal, I will move forward and ask
the affected components to incorporate it into their plans.

Thanks,

Zoltan


On Fri, Jan 11, 2019 at 4:53 PM Zoltan Ivanfi  wrote:

> Hi,
>
> My past experience with fixing timestamps was that no fix was going to
> work if even one of major SQL engines of the Hadoop stack disagreed
> with the approach and was not willing to implement it. For this
> reason, we can't just add a specification to Hive, we need an
> agreement from said communities.
>
> Indeed, there are more projects than these three that deal with
> timestamps, but these are the ones that deal with them on the SQL
> level and this proposal is about the semantics of the SQL timestamp
> types. I have planned to write a small summary from the file format
> perspective as well and send it to affected groups. I had Avro,
> Parquet, ORC, Arrow and Kudu in mind. Based on your suggestion, I will
> add Iceberg to that list.
>
> While I agree that a Google Doc would not be adequate for the final
> version of the plan, I think it is a better tool for doing the review
> and the design discussion than the individual mailing lists, for the
> following reasons:
>
> - It allows separate discussions around separate parts of the proposal
> and these discussion can happen in context (they are tied to specific
> parts of the document).
> - It allows adding suggestions to the proposal, in-context and
> immediately visible to everyone.
> - Most importantly, it is equally accessible to the Hive, Spark and
> Impala communities, therefore allows a real cross-component
> discussion.
>
> Br,
>
> Zoltan
>
> On Fri, Jan 11, 2019 at 12:10 AM Owen O'Malley 
> wrote:
> >
> > -- Forwarded message -
> > From: Owen O'Malley 
> > Date: Thu, Jan 10, 2019 at 3:09 PM
> > Subject: Re: [DISCUSS] Consistent Timestamps across Hadoop
> > To: Zoltan Ivanfi 
> >
> >
> > No, that isn't right.
> >
> > The discussion for Apache projects needs to happen in the open and not
> the private google doc that isn't archived at Apache.
> >
> > Three is a severe underestimate of the projects that care about
> timestamps. The Apache projects that care about parts of that document are:
> >
> > Avro
> > Hive
> > Iceberg
> > Impala
> > ORC
> > Parquet
> > Spark
> >
> > That said, Hive needs to make its decisions about what the semantics of
> Hive should be. Impala, Iceberg, and Spark may make separate choices. Avro,
> ORC, and Parquet need their bindings for each engine to agree with the
> semantics for that engine.
> >
> > My point is that Hive should have a page that describes its current
> semantics with respect to timestamps, but those discussions need to happen
> on the Hive list and result in documents in the Hive wiki. Hive can't tell
> other projects what to do, but by clarifying their semantics it makes
> inter-operation better. In my opinion, Spark SQL should move to local date
> time semantics for timestamp. But they should want to do that to make
> themselves more compatible with the SQL standard. Clearly Hive can't force
> them to change their semantics.
> >
> > .. Owen
> >
> >
> > On Thu, Jan 10, 2019 at 8:04 AM Zoltan Ivanfi  wrote:
> >>

Re: [Notice] hive-bak cannot be pushed to github

2019-01-25 Thread Daniel Gruno

As per INFRA-10731 I'll just put that repo in the bin :)
Sorry for the noise.


On 1/25/19 3:02 PM, Daniel Gruno wrote:

Hi Hive folks,
This is a notice that we are unable to move hive-bak.git to 
gitbox/github because it contains a file 
(itests/thirdparty/spark-1.2.0-bin-hadoop2-without-hive.tgz) that is > 
100MB. While we can probably have it hosted on gitbox, we cannot get it 
to be hosted on github in its current form.


Please, either address this issue by filtering out that file, or advise 
us that you are either okay with retiring/removing the repo or having it 
on gitbox only.


With regards,
Daniel.






[jira] [Created] (HIVE-21166) Keyword as column name in DBS table of Hive

2019-01-25 Thread Vamsi UCSS (JIRA)
Vamsi UCSS created HIVE-21166:
-

 Summary: Keyword as column name in DBS table of Hive
 Key: HIVE-21166
 URL: https://issues.apache.org/jira/browse/HIVE-21166
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Vamsi UCSS


The table "DBS" in hive schema (metastore) has a column called "DESC" which is 
a Hive keyword. This is causing any queries on this table to result in a syntax 
error.





Hive Confluence access

2019-01-25 Thread Mani M
Hi Lefty,

Kindly provide me with Hive Confluence access to update the document below.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inFunctions

My userid: rmsm...@gmail.com
With Regards
M.Mani