[jira] [Created] (HIVE-20174) Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions

2018-07-13 Thread Matt McCline (JIRA)
Matt McCline created HIVE-20174:
---

 Summary: Vectorization: Fix NULL / Wrong Results issues in GROUP 
BY Aggregation Functions
 Key: HIVE-20174
 URL: https://issues.apache.org/jira/browse/HIVE-20174
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline


Write new unit tests that use random data and intentional isRepeating batches to 
check for NULL and Wrong Results issues in vectorized aggregation functions:
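
As a rough illustration of the kind of input such tests would exercise (a sketch only, not the actual patch; the helper name and NULL ratio are assumptions), a batch can be built either with random data or as an intentional isRepeating batch whose single entry may be NULL:

{code:java}
import java.util.Random;

import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class RepeatingBatchSketch {

  // Build a single-column batch of longs. When 'repeating' is true the whole
  // batch is represented by entry 0 (optionally a NULL entry), which is the
  // isRepeating case the new tests need to cover; otherwise random values
  // with randomly sprinkled NULLs are used.
  static VectorizedRowBatch makeLongBatch(Random rand, int size,
      boolean repeating, boolean repeatedValueIsNull) {
    VectorizedRowBatch batch = new VectorizedRowBatch(1, size);
    LongColumnVector col = new LongColumnVector(size);
    if (repeating) {
      col.isRepeating = true;
      col.noNulls = !repeatedValueIsNull;
      col.isNull[0] = repeatedValueIsNull;
      col.vector[0] = rand.nextLong();
    } else {
      for (int i = 0; i < size; i++) {
        boolean isNull = rand.nextInt(4) == 0;   // ~25% NULLs, an arbitrary ratio
        col.isNull[i] = isNull;
        if (isNull) {
          col.noNulls = false;
        } else {
          col.vector[i] = rand.nextLong();
        }
      }
    }
    batch.cols[0] = col;
    batch.size = size;
    return batch;
  }
}
{code}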



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20173) MetaStoreDirectSql#executeWithArray should not catch RuntimeExceptions from JDO

2018-07-13 Thread Aaron Gottlieb (JIRA)
Aaron Gottlieb created HIVE-20173:
-

 Summary: MetaStoreDirectSql#executeWithArray should not catch 
RuntimeExceptions from JDO
 Key: HIVE-20173
 URL: https://issues.apache.org/jira/browse/HIVE-20173
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Aaron Gottlieb


When attempting to test the existence of a Hive database, the Metastore will 
query the backing database.

The method MetaStoreDirectSql#executeWithArray catches all exceptions, turning 
them into MetaExceptions.

Further up the stack, the ObjectStore#getDatabase explicitly catches 
MetaExceptions and turns them into NoSuchObjectExceptions.

Finally, RetryingHMSHandler explicitly looks for NoSuchObjectExceptions and 
does _not_ retry them, thinking they are legitimate answers.

If the exception in MetaStoreDirectSql#executeWithArray were a runtime 
JDOException due to, say, a network error between the Metastore and the backing 
database, this inability to query the backing database looks just like an answer 
of "no database exists" higher up the stack.  Any program depending on this 
information will continue with an incorrect answer rather than retrying the 
original getDatabase query.

I am unsure of the full extent of the effects, but I imagine that explicitly 
_not_ catching RuntimeExceptions in MetaStoreDirectSql#executeWithArray would 
allow the exception to propagate all the way up to the RetryingHMSHandler, which 
would, correctly, retry the operation.
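
A minimal sketch of that idea (hypothetical and simplified; the class name and runQuery helper are placeholders, not the real Hive code):

{code:java}
import javax.jdo.JDOException;

import org.apache.hadoop.hive.metastore.api.MetaException;

// Hypothetical, simplified sketch of the proposed behaviour; everything except
// JDOException/MetaException is a placeholder, not the actual Hive code.
class DirectSqlSketch {

  Object executeWithArray(Object query, Object[] params, String sql) throws MetaException {
    try {
      return runQuery(query, params, sql);   // placeholder for the real JDO execution
    } catch (JDOException e) {
      // Let JDO runtime failures (e.g. a lost connection to the backing DB)
      // propagate, so RetryingHMSHandler can retry instead of the failure
      // being reported as "no such database".
      throw e;
    } catch (Exception e) {
      // Only genuine query-level failures get folded into MetaException.
      throw new MetaException("Failed to execute [" + sql + "]: " + e.getMessage());
    }
  }

  private Object runQuery(Object query, Object[] params, String sql) {
    throw new UnsupportedOperationException("placeholder");
  }
}
{code}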

Would allowing RuntimeExceptions to be thrown from 
MetaStoreDirectSql#executeWithArray be too deleterious?  Or did I miss some 
code path such that my observations are incorrect?

 

Thanks,

Aaron



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #400: HIVE-20172: StatsUpdater failed with GSS Exception w...

2018-07-13 Thread rajkrrsingh
GitHub user rajkrrsingh opened a pull request:

https://github.com/apache/hive/pull/400

HIVE-20172: StatsUpdater failed with GSS Exception while trying to co…

Since the metastore client here runs inside HMS itself, there is no need to 
connect to a remote URI. As part of this PR I will be updating the metastore URI 
so that the client connects in embedded mode.
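
If I read the intent correctly, the change boils down to something like the sketch below (my assumption, not the patch itself): with an empty hive.metastore.uris the client runs embedded instead of opening a remote, Kerberos-authenticated Thrift connection.

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class EmbeddedMetastoreSketch {
  // Sketch of the idea only, not the actual patch: an empty
  // hive.metastore.uris makes the metastore client run embedded
  // (in-process) rather than connect to a remote Thrift endpoint
  // that needs fresh Kerberos credentials.
  static HiveConf embeddedConf(HiveConf base) {
    HiveConf conf = new HiveConf(base);
    conf.setVar(HiveConf.ConfVars.METASTOREURIS, "");
    return conf;
  }
}
{code}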



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajkrrsingh/hive HIVE-20172

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/400.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #400


commit 3efc2d9ba96822101b30c645d746849e772e478c
Author: Rajkumar singh 
Date:   2018-07-13T21:17:40Z

HIVE-20172: StatsUpdater failed with GSS Exception while trying to connect 
to remote metastore




---


[jira] [Created] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore

2018-07-13 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20172:
-

 Summary: StatsUpdater failed with GSS Exception while trying to 
connect to remote metastore
 Key: HIVE-20172
 URL: https://issues.apache.org/jira/browse/HIVE-20172
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.1
 Environment: Hive 1.2.1, Hive 2.1, Java 8
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


StatsUpdater task failed with GSS Exception while trying to connect to remote 
Metastore.
{code}
org.apache.thrift.transport.TTransportException: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533)
at org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
{code}
Since the metastore client is running inside HMS, there is no need to connect to 
a remote URI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20171) Make hive.stats.autogather Per Table

2018-07-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20171:
--

 Summary: Make hive.stats.autogather Per Table
 Key: HIVE-20171
 URL: https://issues.apache.org/jira/browse/HIVE-20171
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Standalone Metastore
Affects Versions: 3.0.0, 4.0.0
Reporter: BELUGA BEHR


{{hive.stats.autogather}}
{{hive.stats.column.autogather}}

https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties

These are currently global-level settings.  Make these global settings the 
'default' values for tables, but allow them to be overridden by a table's 
properties.
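
A hypothetical sketch of how the lookup might be resolved under this proposal (the table-level property name is an assumption; no such property exists today):

{code:java}
import java.util.Map;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.api.Table;

// Hypothetical sketch only: the global setting stays the default, but a
// table property of the same (assumed) name wins when present.
class StatsAutogatherSketch {
  static boolean autogatherEnabled(HiveConf conf, Table table) {
    boolean sessionDefault = conf.getBoolean("hive.stats.autogather", true);
    Map<String, String> params = table.getParameters();
    String override = (params == null) ? null : params.get("hive.stats.autogather");
    return (override == null) ? sessionDefault : Boolean.parseBoolean(override);
  }
}
{code}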

We have recently started seeing tables backed by S3 that are not regularly 
queried, but whose CREATE TABLE is very slow (30+ minutes) because stats are 
collected for all of the files in the table.  We would like to turn this feature 
off for certain S3 tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67887: HIVE-20090

2018-07-13 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67887/
---

(Updated July 13, 2018, 4:04 p.m.)


Review request for hive, Ashutosh Chauhan, Deepak Jaiswal, and Gopal V.


Bugs: HIVE-20090
https://issues.apache.org/jira/browse/HIVE-20090


Repository: hive-git


Description
---

HIVE-20090


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
6ea68c35000a5dadb7a01db47bbd8183bff966da 
  itests/src/test/resources/testconfiguration.properties 
4001b9f452f9dbeaff31c2e766334259605a51af 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 
119aa925c1a71502e649b4f2d193a7ff974263c1 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
dec2d1ef38b748a5c9b40d06af491dd168d70b72 
  ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_sw2.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_sw2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 
f87fe36e11a7c7e535678dbfaaced04f33bbb501 
  ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
6987a96809e3c3300e1b76ea5df3069b3c1d162f 
  ql/src/test/results/clientpositive/perf/tez/query1.q.out 
579940c66e25ebf5e7d0635aaedd0c0cc994f4e0 
  ql/src/test/results/clientpositive/perf/tez/query16.q.out 
0b64c55b0f4ba036aeba4c49f478e9ee1409087c 
  ql/src/test/results/clientpositive/perf/tez/query17.q.out 
2e5e254b2ddc3507f962cbc7691db51f1abafbca 
  ql/src/test/results/clientpositive/perf/tez/query18.q.out 
e8585275b4e51a55ce778dd154033fcdf859e617 
  ql/src/test/results/clientpositive/perf/tez/query2.q.out 
d24899ccf371ad42ef88cebc26cc671c097686da 
  ql/src/test/results/clientpositive/perf/tez/query23.q.out 
6725bec30106bc3321c2869dfc304d0a4da82cf8 
  ql/src/test/results/clientpositive/perf/tez/query24.q.out 
9fcec42c3ab29b898c9c947544a2e29dd08e95e8 
  ql/src/test/results/clientpositive/perf/tez/query25.q.out 
a885cf344b7e29dcf1b2d93d1914e7f9a8d4b921 
  ql/src/test/results/clientpositive/perf/tez/query29.q.out 
46ff49d41a01591f075b2c48ae5a692640fd6eec 
  ql/src/test/results/clientpositive/perf/tez/query31.q.out 
c4d717d8680f6ac6f8f8b6ed01742384a84ddcf9 
  ql/src/test/results/clientpositive/perf/tez/query32.q.out 
6be6f7aa6e6fc50bcedebe3f4d1b5fc00b52ee86 
  ql/src/test/results/clientpositive/perf/tez/query39.q.out 
5966e243ea79b4b884950f34a5b7336e40f92889 
  ql/src/test/results/clientpositive/perf/tez/query40.q.out 
2f116f12ebcba44b876508d0d0f0d827e3a8b28d 
  ql/src/test/results/clientpositive/perf/tez/query54.q.out 
8ab239ce260fb37d988d956fcb9e4eb98a3aeb88 
  ql/src/test/results/clientpositive/perf/tez/query59.q.out 
6b2dcc38737cfc9b955cca1d5b1ac99a7901370b 
  ql/src/test/results/clientpositive/perf/tez/query64.q.out 
a673b9f753a641e111e30a7a4427206d5f2c3da3 
  ql/src/test/results/clientpositive/perf/tez/query69.q.out 
a9c7ac3b21b3e0588e7df7e8c2129fc641d090f1 
  ql/src/test/results/clientpositive/perf/tez/query72.q.out 
48682e340db2916800e9bc5ad61c08c0fb4a8a8b 
  ql/src/test/results/clientpositive/perf/tez/query77.q.out 
163805b2a3dba3e4169d487bd44e7906f66e5868 
  ql/src/test/results/clientpositive/perf/tez/query78.q.out 
90b6f17e1d10ca1e3af17bc53b6df50ffa310af4 
  ql/src/test/results/clientpositive/perf/tez/query80.q.out 
816b525c301fe74460e5657d0b230287d0a6729f 
  ql/src/test/results/clientpositive/perf/tez/query91.q.out 
5e0f00a3e7321c4233f927703701051cab641fb0 
  ql/src/test/results/clientpositive/perf/tez/query92.q.out 
061fcf729d6fa7fde52de3ccd46a800379a92211 
  ql/src/test/results/clientpositive/perf/tez/query94.q.out 
5d19a1634b4657e9ef9595891401e8831d9b0bd4 
  ql/src/test/results/clientpositive/perf/tez/query95.q.out 
400cc1958116b2347a06b52a1460320fd0e0be43 
  
ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out
 eafc1c4a005fa2b3bc169aa4453376f5da6841bc 


Diff: https://reviews.apache.org/r/67887/diff/4/

Changes: https://reviews.apache.org/r/67887/diff/3-4/


Testing
---


Thanks,

Jesús Camacho Rodríguez



[jira] [Created] (HIVE-20170) Improve JoinOperator "rows for join key" Logging

2018-07-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20170:
--

 Summary: Improve JoinOperator "rows for join key" Logging
 Key: HIVE-20170
 URL: https://issues.apache.org/jira/browse/HIVE-20170
 Project: Hive
  Issue Type: Improvement
  Components: Operators
Affects Versions: 3.0.0, 4.0.0
Reporter: BELUGA BEHR


{code}
2018-06-25 09:37:33,193 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 5728000 rows for join key [333, 22]
2018-06-25 09:37:33,901 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 5828000 rows for join key [333, 22]
2018-06-25 09:37:34,623 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 5928000 rows for join key [333, 22]
2018-06-25 09:37:35,342 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 6028000 rows for join key [333, 22]
{code}

https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java#L120

This logging should use the same facilities as the other Operators for emitting 
this type of log message [HIVE-10078].  Maybe this functionality should be 
refactored into an AbstractOperator class?

Also, it should print a final count for each join value.
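
As a rough sketch of that pattern (class, field, and threshold names are mine, not actual operator code; compare the ReduceSinkOperator snippet quoted in HIVE-20168 below), per-key counting with a final total could look like:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch of the logging pattern the other operators use: log at row counts
// that grow 10x (or every N rows when configured), plus one final total.
class JoinKeyRowLoggerSketch {
  private static final Logger LOG = LoggerFactory.getLogger(JoinKeyRowLoggerSketch.class);

  private final long logEveryNRows;   // 0 means "multiply the threshold by 10"
  private long nextLogThreshold = 1000;
  private long rowCount;

  JoinKeyRowLoggerSketch(long logEveryNRows) {
    this.logEveryNRows = logEveryNRows;
  }

  void onRow(Object joinKey, int tableAlias) {
    if (++rowCount == nextLogThreshold) {
      LOG.info("table {} has {} rows for join key {}", tableAlias, rowCount, joinKey);
      nextLogThreshold = logEveryNRows == 0 ? nextLogThreshold * 10
                                            : rowCount + logEveryNRows;
    }
  }

  void onClose(Object joinKey, int tableAlias) {
    // The final count requested in this ticket.
    LOG.info("table {} had {} total rows for join key {}", tableAlias, rowCount, joinKey);
  }
}
{code}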



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20169:
--

 Summary: Print Final Rows Processed in MapOperator
 Key: HIVE-20169
 URL: https://issues.apache.org/jira/browse/HIVE-20169
 Project: Hive
  Issue Type: Improvement
  Components: Operators
Affects Versions: 3.0.0, 4.0.0
Reporter: BELUGA BEHR


https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582

This class emits a log message every time a certain number of records has been 
processed, but it does not print a final count.

Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
message providing the total number of rows read by this mapper.
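
A hedged sketch of what that override might look like (not the actual patch; the counter field is a stand-in for the one MapOperator already keeps):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import org.apache.hadoop.hive.ql.metadata.HiveException;

// Sketch only: a closeOp override that logs the final row total. 'numRows'
// and the class wiring are stand-ins for MapOperator's own state.
class MapOperatorCloseSketch {
  private static final Logger LOG = LoggerFactory.getLogger(MapOperatorCloseSketch.class);
  private long numRows;   // stand-in for the counter MapOperator already maintains

  protected void closeOp(boolean abort) throws HiveException {
    // In the real class this would run after the existing closeOp work
    // (i.e. after calling super.closeOp(abort)).
    LOG.info("{}: records read - {} (final)", getClass().getSimpleName(), numRows);
  }
}
{code}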



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20168) ReduceSinkOperator Logging Hidden

2018-07-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20168:
--

 Summary: ReduceSinkOperator Logging Hidden
 Key: HIVE-20168
 URL: https://issues.apache.org/jira/browse/HIVE-20168
 Project: Hive
  Issue Type: Bug
  Components: Operators
Affects Versions: 3.0.0, 4.0.0
Reporter: BELUGA BEHR


[https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java]

 
{code:java}
if (LOG.isTraceEnabled()) {
  if (numRows == cntr) {
cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
if (cntr < 0 || numRows < 0) {
  cntr = 0;
  numRows = 1;
}
LOG.info(toString() + ": records written - " + numRows);
  }
}

...

if (LOG.isTraceEnabled()) {
  LOG.info(toString() + ": records written - " + numRows);
}
{code}

There are logging guards here that check for TRACE level, but the log statements 
themselves are at INFO level.  This is important logging for detecting data 
skew.  Please change the guards to check for INFO... or, preferably, remove the 
guards altogether, since it is very rare that a service runs with only WARN 
level logging.
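
A minimal sketch of the first of those two options (simplified; the class and counter are stand-ins, not the actual ReduceSinkOperator code):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch of the requested fix: the guard level now matches the statement it
// protects; 'numRows' stands in for the operator's existing counter.
class RecordsWrittenLogSketch {
  private static final Logger LOG = LoggerFactory.getLogger(RecordsWrittenLogSketch.class);
  private long numRows;

  void logFinalCount() {
    if (LOG.isInfoEnabled()) {               // was: LOG.isTraceEnabled()
      LOG.info("{}: records written - {}", this, numRows);
    }
  }
}
{code}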



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20167) apostrophe in midline comment fails with ParseException

2018-07-13 Thread Trey Fore (JIRA)
Trey Fore created HIVE-20167:


 Summary: apostrophe in midline comment fails with ParseException
 Key: HIVE-20167
 URL: https://issues.apache.org/jira/browse/HIVE-20167
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 2.3.2
 Environment: Observed on an AWS EMR cluster. 

Hive cli, executing script from bash with "hive -f ..." (not interactive).

 
Reporter: Trey Fore


This line causes a ParseException:

{{    , member_id string                  --  standardizing from client's memberID}}

When the apostrophe is removed, leaving:

{{    , member_id string                  --  standardizing from clients memberID}}

the line is parsed correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-13 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20166:
--

 Summary: LazyBinaryStruct Warn Level Logging
 Key: HIVE-20166
 URL: https://issues.apache.org/jira/browse/HIVE-20166
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 3.0.0, 4.0.0
Reporter: BELUGA BEHR


https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180

{code}
// Extra bytes at the end?
if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
  extraFieldWarned = true;
  LOG.warn("Extra bytes detected at the end of the row! " +
   "Last field end " + lastFieldByteEnd + " and serialize buffer end " 
+ structByteEnd + ". " +
   "Ignoring similar problems.");
}

// Missing fields?
if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
  missingFieldWarned = true;
  LOG.info("Missing fields! Expected " + fields.length + " fields but " +
  "only got " + fieldId + "! " +
  "Last field end " + lastFieldByteEnd + " and serialize buffer end " + 
structByteEnd + ". " +
  "Ignoring similar problems.");
}
{code}

The first log statement is at 'warn' level, the second at 'info' level.  Please 
change the second log statement to 'warn' as well.  This seems like a problem 
that the user would like to know about.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #399: HIVE-20152: reset db state, when repl dump fails, so...

2018-07-13 Thread anishek
GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/399

HIVE-20152: reset db state, when repl dump fails, so rename table can be 
done



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-20152

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/399.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #399


commit 780ebaa59627ba4954c4e72fa7e60dad2089a771
Author: Anishek Agarwal 
Date:   2018-07-13T09:53:09Z

HIVE-20152: reset db state, when repl dump fails, so rename table can be 
done




---


Re: Review Request 67895: Improve HiveMetaStoreClient.dropDatabase

2018-07-13 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67895/
---

(Updated July 13, 2018, 9:22 a.m.)


Review request for hive.


Changes
---

Rebased to top of master again...


Bugs: HIVE-18705
https://issues.apache.org/jira/browse/HIVE-18705


Repository: hive-git


Description
---

HiveMetaStoreClient.dropDatabase has a strange implementation to ensure that 
client-side hooks (for non-native tables, e.g. HBase) are dealt with. Currently 
it starts by retrieving all the tables from HMS, and then sends dropTable calls 
to HMS table-by-table. At the end it issues a dropDatabase call just to be sure.

I believe this could be refactored to speed up the dropDB in situations where 
the average table count per DB is very high.
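
For context, a simplified sketch of the table-by-table pattern described above (not the real HiveMetaStoreClient code; flag values are illustrative):

{code:java}
import java.util.List;

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.thrift.TException;

// Simplified sketch of the current pattern: fetch every table, drop them one
// by one so client-side hooks (e.g. HBase) fire, then drop the now-empty DB.
class DropDatabaseSketch {
  static void dropDatabaseTableByTable(IMetaStoreClient client, String dbName)
      throws TException {
    List<String> tables = client.getAllTables(dbName);
    for (String table : tables) {
      client.dropTable(dbName, table, /* deleteData */ true, /* ignoreUnknownTab */ true);
    }
    client.dropDatabase(dbName, /* deleteData */ true, /* ignoreUnknownDb */ false,
        /* cascade */ false);
  }
}
{code}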


Diffs (updated)
-

  hbase-handler/src/test/queries/positive/drop_database_table_hooks.q 
PRE-CREATION 
  hbase-handler/src/test/results/positive/drop_database_table_hooks.q.out 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/TableIterable.java 
d8e771d0ffa7d680b2a22436727f896674cd40ff 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestTableIterable.java 
6637d150b84c9fa86e6a3a90449606437e7c9d72 
  
service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 
838dd89ca82792ca8af8eb0f30aa63e690e41f43 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8d88749effa89e50d8be8ed216419cd77836fd34 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 bfd7141a8b987e5288277a46d56de32574d9aa69 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/TableIterable.java
 PRE-CREATION 
  
standalone-metastore/metastore-common/src/test/java/org/apache/hadoop/hive/metastore/TestTableIterable.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/67895/diff/2/

Changes: https://reviews.apache.org/r/67895/diff/1-2/


Testing
---

Drop database is an existing feature, so existing tests should be fine, but 
since I'm poking around client-side hooks I've added an HBase drop-database 
qtest so that that code path is covered.


Thanks,

Adam Szita



[jira] [Created] (HIVE-20165) Enable ZLIB for streaming ingest

2018-07-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20165:


 Summary: Enable ZLIB for streaming ingest
 Key: HIVE-20165
 URL: https://issues.apache.org/jira/browse/HIVE-20165
 Project: Hive
  Issue Type: Bug
  Components: Streaming, Transactions
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Per [~gopalv]'s recommendation, I tried running streaming ingest with and 
without ZLIB. The numbers are below.

 
*Compression: NONE*
Total rows committed: 9380
Throughput: *156* rows/second
[prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
*14.1 G*  /apps/hive/warehouse/prasanth.db/culvert
 
*Compression: ZLIB*
Total rows committed: 9210
Throughput: *1535000* rows/second
[prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
*7.4 G*  /apps/hive/warehouse/prasanth.db/culvert
 
ZLIB is getting us 2x compression with only about 2% lower throughput. We should 
enable ZLIB by default for streaming ingest.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)