Re: Review Request 61543: HIVE-17283

2017-08-10 Thread Gopal V

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61543/#review182560
---


Ship it!




Ship It!

- Gopal V


On Aug. 10, 2017, 6:54 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61543/
> ---
> 
> (Updated Aug. 10, 2017, 6:54 a.m.)
> 
> 
> Review request for hive, Gopal V and Jason Dere.
> 
> 
> Bugs: HIVE-17283
> https://issues.apache.org/jira/browse/HIVE-17283
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable parallel edges of semijoin along with mapjoins
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7cee344295 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 5614c26819 
>   ql/src/test/queries/clientpositive/dynamic_semijoin_reduction.q d631401760 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
> 2eedb6efb3 
> 
> 
> Diff: https://reviews.apache.org/r/61543/diff/2/
> 
> 
> Testing
> ---
> 
> Unit test added.
> Pending ptests.
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



[jira] [Created] (HIVE-17288) LlapOutputFormatService: Increase netty event loop threads for

2017-08-10 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-17288:
---

 Summary: LlapOutputFormatService: Increase netty event loop 
threads for
 Key: HIVE-17288
 URL: https://issues.apache.org/jira/browse/HIVE-17288
 Project: Hive
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Priority: Minor


Currently it is set to 1, which is used for both the parent (acceptor) and 
client groups. It would be good to leave it at the default, which sets the 
number of threads to "number of processors * 2". It can be modified later via 
{{-Dio.netty.eventLoopThreads}}.
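For reference, a minimal sketch of how that default is derived (this mirrors netty's documented behavior when the property is unset; the class and method names below are illustrative, not code from LlapOutputFormatService):

```java
// Sketch: netty's default event loop group size when
// -Dio.netty.eventLoopThreads is not set is "number of processors * 2".
public class EventLoopThreadsDefault {
    static int defaultEventLoopThreads() {
        String configured = System.getProperty("io.netty.eventLoopThreads");
        if (configured != null) {
            return Integer.parseInt(configured.trim());
        }
        // Documented netty default: twice the available processors.
        return Runtime.getRuntime().availableProcessors() * 2;
    }

    public static void main(String[] args) {
        // Prints a machine-dependent value, e.g. 16 on an 8-core box.
        System.out.println(defaultEventLoopThreads());
    }
}
```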



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 61543: HIVE-17283

2017-08-10 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61543/
---

(Updated Aug. 10, 2017, 6:54 a.m.)


Review request for hive, Gopal V and Jason Dere.


Changes
---

Implemented review comments


Bugs: HIVE-17283
https://issues.apache.org/jira/browse/HIVE-17283


Repository: hive-git


Description
---

Enable parallel edges of semijoin along with mapjoins


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7cee344295 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 5614c26819 
  ql/src/test/queries/clientpositive/dynamic_semijoin_reduction.q d631401760 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
2eedb6efb3 


Diff: https://reviews.apache.org/r/61543/diff/2/

Changes: https://reviews.apache.org/r/61543/diff/1-2/


Testing
---

Unit test added.
Pending ptests.


Thanks,

Deepak Jaiswal



[GitHub] hive pull request #213: HIVE-17212: Dynamic add partition by insert shouldn'...

2017-08-10 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/213


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[DISCUSSION] WebUI query plan graphs

2017-08-10 Thread Karen Coppage
Hi all,

I’m working on a feature of the Hive WebUI Query Plan tab that would
provide the option to display the query plan as a nice graph (scroll down
for screenshots). If you click on one of the graph’s stages, the plan for
that stage appears as text below.

Stages are color-coded if they have a status (Success, Error, Running), and
the rest are grayed out. Coloring is based on status already available in
the WebUI, under the Stages tab.

There is an additional option to display stats for MapReduce tasks. This
includes the job’s ID, tracking URL (where the logs are found), and mapper
and reducer numbers/progress, among other info.

The library I’m using for the graph is called vis.js (http://visjs.org/).
It has an Apache license, and the only necessary file to be included from
this library is about 700 KB.

I tried to keep server-side changes minimal, and graph generation is taken
care of by the client. Plans with more than a given number of stages
(default: 25) won't be displayed in order to preserve resources.

I’d love to hear any and all input from the community about this feature:
do you think it’s useful, and is there anything important I’m missing?

Thanks,

Karen Coppage

*

A completely successful query:

[image: Inline image 1]


A MapReduce task selected, with MapReduce stats view on:

[image: Inline image 2]


Full MapReduce stats, lacking some information because the query was run in
local mode:

[image: Inline image 3]


A non-MapReduce stage selected:

[image: Inline image 4]


Last stage running:

[image: Inline image 5]


Last stage returns error:

[image: Inline image 6]


[DISCUSSION] WebUI query plan graphs

2017-08-10 Thread Karen Coppage
I'm resending this with a link to pictures, as they could not be embedded.

Screenshots here:
https://drive.google.com/drive/folders/0B0gDaJsjA3cxMUV5SW5VQnh4aGM



Hi all,

I’m working on a feature of the Hive WebUI Query Plan tab that would
provide the option to display the query plan as a nice graph (scroll down
for link to screenshots). If you click on one of the graph’s stages, the
plan for that stage appears as text below.

Stages are color-coded if they have a status (Success, Error, Running), and
the rest are grayed out. Coloring is based on status already available in
the WebUI, under the Stages tab.

There is an additional option to display stats for MapReduce tasks. This
includes the job’s ID, tracking URL (where the logs are found), and mapper
and reducer numbers/progress, among other info.

The library I’m using for the graph is called vis.js (http://visjs.org/).
It has an Apache license, and the only necessary file to be included from
this library is about 700 KB.

I tried to keep server-side changes minimal, and graph generation is taken
care of by the client. Plans with more than a given number of stages
(default: 25) won't be displayed in order to preserve resources.

I’d love to hear any and all input from the community about this feature:
do you think it’s useful, and is there anything important I’m missing?

Thanks,

Karen Coppage


Re: [DISCUSSION] WebUI query plan graphs

2017-08-10 Thread Xuefu Zhang
Hi Karen,

Thanks for reaching out. While your message doesn't seem to show any
images, I think the feature would be a great addition to Hive. (The Hive
community always welcomes contributions like this.)

Please feel free to create a JIRA for easier discussion and tracking.

Thanks again for your interest.

--Xuefu

On Thu, Aug 10, 2017 at 6:25 AM, Karen Coppage 
wrote:

> Hi all,
>
> I’m working on a feature of the Hive WebUI Query Plan tab that would
> provide the option to display the query plan as a nice graph (scroll down
> for screenshots). If you click on one of the graph’s stages, the plan for
> that stage appears as text below.
>
> Stages are color-coded if they have a status (Success, Error, Running), and
> the rest are grayed out. Coloring is based on status already available in
> the WebUI, under the Stages tab.
>
> There is an additional option to display stats for MapReduce tasks. This
> includes the job’s ID, tracking URL (where the logs are found), and mapper
> and reducer numbers/progress, among other info.
>
> The library I’m using for the graph is called vis.js (http://visjs.org/).
> It has an Apache license, and the only necessary file to be included from
> this library is about 700 KB.
>
> I tried to keep server-side changes minimal, and graph generation is taken
> care of by the client. Plans with more than a given number of stages
> (default: 25) won't be displayed in order to preserve resources.
>
> I’d love to hear any and all input from the community about this feature:
> do you think it’s useful, and is there anything important I’m missing?
>
> Thanks,
>
> Karen Coppage
>
> *
>
> A completely successful query:
>
> [image: Inline image 1]
>
>
> A MapReduce task selected, with MapReduce stats view on:
>
> [image: Inline image 2]
>
>
> Full MapReduce stats, lacking some information because the query was run in
> local mode:
>
> [image: Inline image 3]
>
>
> A non-MapReduce stage selected:
>
> [image: Inline image 4]
>
>
> Last stage running:
>
> [image: Inline image 5]
>
>
> Last stage returns error:
>
> [image: Inline image 6]
>


[jira] [Created] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-10 Thread Peter Vary (JIRA)
Peter Vary created HIVE-17291:
-

 Summary: Set the number of executors based on config if client 
does not provide information
 Key: HIVE-17291
 URL: https://issues.apache.org/jira/browse/HIVE-17291
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 3.0.0
Reporter: Peter Vary
Assignee: Peter Vary


When calculating memory and cores, if the client does not provide the 
information, we should try to use the configured default. This can happen 
on startup, when {{spark.dynamicAllocation.enabled}} is not enabled.





[jira] [Created] (HIVE-17295) Make MemoryManagerImpl.ROWS_BETWEEN_CHECKS configurable

2017-08-10 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17295:
-

 Summary: Make MemoryManagerImpl.ROWS_BETWEEN_CHECKS configurable
 Key: HIVE-17295
 URL: https://issues.apache.org/jira/browse/HIVE-17295
 Project: Hive
  Issue Type: Bug
Reporter: Eugene Koifman
Assignee: Eugene Koifman


currently addedRow() looks like
{noformat}
public void addedRow(int rows) throws IOException {
  rowsAddedSinceCheck += rows;
  if (rowsAddedSinceCheck >= ROWS_BETWEEN_CHECKS) {
    notifyWriters();
  }
}
{noformat}

it would be convenient for testing to set ROWS_BETWEEN_CHECKS to a low value so 
that we can generate multiple stripes with very little data.

Currently the only way to do this is to create a new MemoryManager that 
overrides this method and install it via OrcFile.WriterOptions, but this only 
works when you have control over creating the Writer.

There is no way to do this via some set of config params to make a Hive query, 
for example, create multiple stripes with little data.
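A hypothetical sketch of what a configurable check interval could look like. The property key, class name, and default value below are invented for illustration; they are not Hive's or ORC's actual API:

```java
import java.util.Properties;

// Hypothetical sketch: the hard-coded ROWS_BETWEEN_CHECKS constant replaced
// by a value read from configuration, so tests can force frequent memory
// checks and generate many stripes from little data.
public class ConfigurableMemoryCheck {
    // Default value here is illustrative, not the real constant.
    static final int DEFAULT_ROWS_BETWEEN_CHECKS = 5000;

    private final int rowsBetweenChecks;
    private int rowsAddedSinceCheck = 0;
    int notifyCount = 0; // stands in for notifyWriters() side effects

    ConfigurableMemoryCheck(Properties conf) {
        // "test.orc.rows.between.checks" is an invented key for this sketch.
        rowsBetweenChecks = Integer.parseInt(conf.getProperty(
            "test.orc.rows.between.checks",
            String.valueOf(DEFAULT_ROWS_BETWEEN_CHECKS)));
    }

    public void addedRow(int rows) {
        rowsAddedSinceCheck += rows;
        if (rowsAddedSinceCheck >= rowsBetweenChecks) {
            notifyCount++;           // real code would call notifyWriters()
            rowsAddedSinceCheck = 0; // assume the check resets the counter
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("test.orc.rows.between.checks", "10");
        ConfigurableMemoryCheck mm = new ConfigurableMemoryCheck(conf);
        for (int i = 0; i < 25; i++) {
            mm.addedRow(1);
        }
        System.out.println(mm.notifyCount); // checks trigger at rows 10 and 20
    }
}
```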





Review Request 61586: HIVE-17286

2017-08-10 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61586/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-17286
https://issues.apache.org/jira/browse/HIVE-17286


Repository: hive-git


Description
---

HIVE-17286


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java 
160ce663ba5068ad029d51f431b87a40f97341a5 
  
common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimator.java
 4517b694ee38fd02ff32430ef24720d91c092f3a 
  
common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java
 6a29859df56f7d41b87d95242367f6ef401b4060 
  common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java 
b6f7fdda0c9363516f610d5280c3c4e0821cba09 
  common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java 
182560afbe6a8b9e2ac10591a520603bac0eb8d5 
  
common/src/test/org/apache/hadoop/hive/common/ndv/fm/TestFMSketchSerialization.java
 74fdf58d2d3640a9e923e3dbd96f0704ba3b5f35 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
73754fff857367f78793cbe5d976b5386a60b0b3 
  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
d53ea4c5b2f294bbd4d530ce06a9a3fd1e640bde 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
 6fae3e50673aa1d4a46b1487b62c2381effdafa7 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DecimalColumnStatsAggregator.java
 c5e72ebd43a3e036c5aa03b70afcf8531ba83d32 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DoubleColumnStatsAggregator.java
 e55c41230d5302de5e820649394797048b503f1c 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/LongColumnStatsAggregator.java
 2ee09f3c47da0777dcebdc34b710335b8e66e50c 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
 2ea2fcca05de38f630592f4d4c309c1579ca04af 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DateColumnStatsMerger.java
 2542a00d361b6b2be532b9288b5958c680fa40fa 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DecimalColumnStatsMerger.java
 4e8e1297585c18edc40fa71a3e33c0310b99750a 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DoubleColumnStatsMerger.java
 4ef5c39d1c107a329023f38769b60045b7499fb4 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/LongColumnStatsMerger.java
 acf7f03c72adb64696f5d7c3e0cd1bb64e34f7fd 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/StringColumnStatsMerger.java
 b3cd33c671ec8b478e233124671627238288a4e3 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestOldSchema.java 
54828f2289657470e1da02cdb938dd14ef6488af 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
d96f432fee9f4dde4ba76a21a28005299d3f9f7a 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 23800734f7f7a00e363b367db2e197bd3b4f640a 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 
8ee41bfab2b44e6c8b345fe5d9e3f6b30d2a328c 
  ql/src/test/results/clientpositive/autoColumnStats_4.q.out 
e8445b2c5cae1d84ef4f37532423dd0eac9b 
  ql/src/test/results/clientpositive/autoColumnStats_5.q.out 
29963975d38aa89a96744645a097278b7ce64f1f 
  ql/src/test/results/clientpositive/autoColumnStats_6.q.out 
1b125701d7c92fea66d1e6234e9a86a7b1922b17 
  ql/src/test/results/clientpositive/autoColumnStats_7.q.out 
9e2121e0deae1deb37b7f1222c22aea8c6839f48 
  ql/src/test/results/clientpositive/autoColumnStats_8.q.out 
cdf2082d53356d18ed445b0d04b6673133878b2d 
  ql/src/test/results/clientpositive/autoColumnStats_9.q.out 
e32c884c7d77b739f9ec114ae9eeb08da6aabb92 
  ql/src/test/results/clientpositive/char_udf1.q.out 
e701d64357aadca542d82caba9a19f0d5392c328 
  ql/src/test/results/clientpositive/column_pruner_multiple_children.q.out 
00e53dc3e9a9dd375f341f0beb0a794c3f201166 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out 
c0f007159d3c4d67eb61077cfcde670e8abad77d 
  ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 
0cb4863a17f566a3f336fcf414428817281d54a9 
  ql/src/test/results/clientpositive/columnstats_quoting.q.out 
7e080fec9bb21aeede72b17141e7c4ce0c1c11cd 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 
b85c1ff721d61be540e951ff5d60c1751da17b09 
  ql/src/test/results/clientpositive/compute_stats_date.q.out 
78d04f9dfcb85c59c443158de17e3773a5c92564 
  ql/src/test/results/clientpositive/compute_stats_decimal.q.out 
e18b9890623d8cafd89c7eea59c0ace762f8e3b2 
  ql/src/test/results/clientpositive/compute_stats_double.q.out 
d937c3a00292fc850c8dff54721fffec82a9bc00 
  ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 

[jira] [Created] (HIVE-17296) Acid tests with multiple splits

2017-08-10 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17296:
-

 Summary: Acid tests with multiple splits
 Key: HIVE-17296
 URL: https://issues.apache.org/jira/browse/HIVE-17296
 Project: Hive
  Issue Type: Test
  Components: Transactions
Affects Versions: 3.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical


Data files in an Acid table are ORC files which may have multiple stripes.
Such files in base/ or delta/ (and original files from non-acid to acid 
conversion) are split by OrcInputFormat into multiple (stripe-sized) chunks.
There is additional logic in OrcRawRecordMerger 
(discoverKeyBounds/discoverOriginalKeyBounds) that is not tested by any E2E 
tests, since none of them have enough data to generate multiple stripes in a 
single file.

testRecordReaderOldBaseAndDelta/testRecordReaderNewBaseAndDelta/testOriginalReaderPair
in TestOrcRawRecordMerger have some logic to test this, but it really needs 
e2e tests.

With ORC-228 it will be possible to write such tests.





[jira] [Created] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-10 Thread Peter Vary (JIRA)
Peter Vary created HIVE-17292:
-

 Summary: Change TestMiniSparkOnYarnCliDriver test configuration to 
use the configured cores
 Key: HIVE-17292
 URL: https://issues.apache.org/jira/browse/HIVE-17292
 Project: Hive
  Issue Type: Sub-task
  Components: Spark, Test
Affects Versions: 3.0.0
Reporter: Peter Vary
Assignee: Peter Vary


Currently the {{hive-site.xml}} for the {{TestMiniSparkOnYarnCliDriver}} test 
defines 2 cores and 2 executors, but only 1 is used, because the MiniCluster 
does not allow the creation of the 3rd container.

The FairScheduler uses 1GB increments for memory, but the containers would 
like to use only 512MB. We should change the FairScheduler configuration to 
use only the requested 512MB.
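A minimal sketch of the kind of change described, assuming the standard YARN FairScheduler properties (verify the exact keys against the YARN version in use):

```xml
<!-- yarn-site.xml sketch: allocate in 512MB steps instead of the 1GB
     default, so 512MB container requests are not rounded up to 1GB. -->
<property>
  <name>yarn.scheduler.increment-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
```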





[jira] [Created] (HIVE-17293) ETL split strategy not accounting for empty base and non-empty delta buckets

2017-08-10 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-17293:


 Summary: ETL split strategy not accounting for empty base and 
non-empty delta buckets
 Key: HIVE-17293
 URL: https://issues.apache.org/jira/browse/HIVE-17293
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 3.0.0, 2.4.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical


Observed an issue with a customer case where there are 2 buckets (bucket_0 
and bucket_1).
Base bucket 0 had some rows whereas bucket 1 was empty.
Delta buckets 0 and 1 had some rows.

The ETL split strategy did not generate an OrcSplit for bucket 1 even though 
it had some rows in the delta directories.





[jira] [Created] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17297:
---

 Summary: allow AM to use LLAP guaranteed tasks
 Key: HIVE-17297
 URL: https://issues.apache.org/jira/browse/HIVE-17297
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








Re: Review Request 61586: HIVE-17286

2017-08-10 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61586/#review182654
---




common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java
Lines 39-41 (original), 38-40 (patched)


This logic can now be modified to say magic[0] = buf[0]. No need to create 
a stream.



common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java
Line 587 (original), 588 (patched)


Need to close stream.



ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
Lines 184 (patched)


Printing the whole string is not useful. Let's print only the first 2 bytes 
in the buffer; that will be the header: FM or HL. No need for encoding.



ql/src/test/results/clientpositive/char_udf1.q.out
Line 409 (original), 409 (patched)


Surprised to see this change, since this patch should not have changed this.



ql/src/test/results/clientpositive/compute_stats_date.q.out
Line 46 (original), 46 (patched)


Doesn't look like base64-encoded.


- Ashutosh Chauhan


On Aug. 10, 2017, 11:17 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61586/
> ---
> 
> (Updated Aug. 10, 2017, 11:17 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-17286
> https://issues.apache.org/jira/browse/HIVE-17286
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-17286
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java 
> 160ce663ba5068ad029d51f431b87a40f97341a5 
>   
> common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimator.java
>  4517b694ee38fd02ff32430ef24720d91c092f3a 
>   
> common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java
>  6a29859df56f7d41b87d95242367f6ef401b4060 
>   common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java 
> b6f7fdda0c9363516f610d5280c3c4e0821cba09 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java 
> 182560afbe6a8b9e2ac10591a520603bac0eb8d5 
>   
> common/src/test/org/apache/hadoop/hive/common/ndv/fm/TestFMSketchSerialization.java
>  74fdf58d2d3640a9e923e3dbd96f0704ba3b5f35 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> 73754fff857367f78793cbe5d976b5386a60b0b3 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
> d53ea4c5b2f294bbd4d530ce06a9a3fd1e640bde 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
>  6fae3e50673aa1d4a46b1487b62c2381effdafa7 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DecimalColumnStatsAggregator.java
>  c5e72ebd43a3e036c5aa03b70afcf8531ba83d32 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DoubleColumnStatsAggregator.java
>  e55c41230d5302de5e820649394797048b503f1c 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/LongColumnStatsAggregator.java
>  2ee09f3c47da0777dcebdc34b710335b8e66e50c 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
>  2ea2fcca05de38f630592f4d4c309c1579ca04af 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DateColumnStatsMerger.java
>  2542a00d361b6b2be532b9288b5958c680fa40fa 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DecimalColumnStatsMerger.java
>  4e8e1297585c18edc40fa71a3e33c0310b99750a 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DoubleColumnStatsMerger.java
>  4ef5c39d1c107a329023f38769b60045b7499fb4 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/LongColumnStatsMerger.java
>  acf7f03c72adb64696f5d7c3e0cd1bb64e34f7fd 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/StringColumnStatsMerger.java
>  b3cd33c671ec8b478e233124671627238288a4e3 
>   metastore/src/test/org/apache/hadoop/hive/metastore/TestOldSchema.java 
> 54828f2289657470e1da02cdb938dd14ef6488af 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
> d96f432fee9f4dde4ba76a21a28005299d3f9f7a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
>  23800734f7f7a00e363b367db2e197bd3b4f640a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
>  

[jira] [Created] (HIVE-17294) LLAP: switch node heartbeats to protobuf

2017-08-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17294:
---

 Summary: LLAP: switch node heartbeats to protobuf
 Key: HIVE-17294
 URL: https://issues.apache.org/jira/browse/HIVE-17294
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin








[jira] [Created] (HIVE-17289) IMPORT should copy files without using doAs user.

2017-08-10 Thread Sankar Hariappan (JIRA)
Sankar Hariappan created HIVE-17289:
---

 Summary: IMPORT should copy files without using doAs user.
 Key: HIVE-17289
 URL: https://issues.apache.org/jira/browse/HIVE-17289
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2, repl
Affects Versions: 3.0.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan
 Fix For: 3.0.0


Currently, IMPORT uses distcp to copy larger files or a large number of files 
from the dump directory to the table staging directory. But this copy fails, 
as distcp is always done with the doAs user specified in 
hive.distcp.privileged.doAs, which is "hdfs" by default.
Need to remove usage of the doAs user when running distcp from the IMPORT flow.
Also, need to set the default config for hive.distcp.privileged.doAs to "hive", 
as the "hdfs" super-user is never allowed.





[jira] [Created] (HIVE-17290) Should use equals() rather than == to compare strings

2017-08-10 Thread Oleg Danilov (JIRA)
Oleg Danilov created HIVE-17290:
---

 Summary: Should use equals() rather than == to compare strings
 Key: HIVE-17290
 URL: https://issues.apache.org/jira/browse/HIVE-17290
 Project: Hive
  Issue Type: Bug
Reporter: Oleg Danilov
Priority: Trivial


There are a number of places where strings are compared with == or !=. This 
seems to work now, thanks to string interning, but it would be better not to 
rely on that.
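A minimal illustration of the hazard (not code from Hive): == compares references, not contents, so it only happens to work when both strings share an interned instance.

```java
// Demonstrates why == is unsafe for string comparison in Java:
// interned literals share a reference, but strings constructed at
// runtime generally do not.
public class StringCompare {
    public static void main(String[] args) {
        String a = "hive";
        String b = new String("hive");       // same contents, distinct object
        System.out.println(a == b);          // false: reference comparison
        System.out.println(a.equals(b));     // true: content comparison
        System.out.println(a == b.intern()); // true: interning restores sharing
    }
}
```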





[GitHub] hive pull request #226: HIVE-17290: Use equals() rather than == to compare s...

2017-08-10 Thread dosoft
GitHub user dosoft opened a pull request:

https://github.com/apache/hive/pull/226

HIVE-17290: Use equals() rather than == to compare strings



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dosoft/hive HIVE-17290

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #226


commit 6cc3f0b952b701305037cd861e9bfbb3f604c023
Author: Oleg Danilov 
Date:   2017-08-10T12:55:10Z

HIVE-17290: Use equals() rather than == to compare strings






adding the column to hive partition

2017-08-10 Thread mallik new
Hello All,

 How do I add a column to an already partitioned table in Hive? After I
added the column using CASCADE, the new column shows null values. How do I
eliminate the nulls and have this column pick up values?


 Looking for your support.

thanks,
Mallik.


[jira] [Created] (HIVE-17298) export when running distcp for large number of files runs as privileged user from hiveconf

2017-08-10 Thread anishek (JIRA)
anishek created HIVE-17298:
--

 Summary: export when running distcp for large number of files runs 
as privileged user from hiveconf
 Key: HIVE-17298
 URL: https://issues.apache.org/jira/browse/HIVE-17298
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.0.0
Reporter: anishek
Assignee: anishek
 Fix For: 3.0.0


When the EXPORT command encounters a large number of files, or files of 
large size, it invokes distcp.

distcp is run as a privileged user taken from the config 
hive.distcp.privileged.doAs. This should not be the case; distcp should not 
be run as a privileged user.


