[jira] [Created] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2017-07-27 Thread Rui Li (JIRA)
Rui Li created HIVE-17193:
-

 Summary: HoS: don't combine map works that are targets of 
different DPPs
 Key: HIVE-17193
 URL: https://issues.apache.org/jira/browse/HIVE-17193
 Project: Hive
  Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-17192) Add InterfaceAudience and InterfaceStability annotations for Stats Collection APIs

2017-07-27 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17192:
---

 Summary: Add InterfaceAudience and InterfaceStability annotations 
for Stats Collection APIs
 Key: HIVE-17192
 URL: https://issues.apache.org/jira/browse/HIVE-17192
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Sahil Takiar
Assignee: Sahil Takiar








[jira] [Created] (HIVE-17191) Add InterfaceAudience and InterfaceStability annotations for StorageHandler APIs

2017-07-27 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17191:
---

 Summary: Add InterfaceAudience and InterfaceStability annotations 
for StorageHandler APIs
 Key: HIVE-17191
 URL: https://issues.apache.org/jira/browse/HIVE-17191
 Project: Hive
  Issue Type: Sub-task
  Components: StorageHandler
Reporter: Sahil Takiar
Assignee: Sahil Takiar








[jira] [Created] (HIVE-17190) Don't store bitvectors for unpartitioned table

2017-07-27 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-17190:
---

 Summary: Don't store bitvectors for unpartitioned table
 Key: HIVE-17190
 URL: https://issues.apache.org/jira/browse/HIVE-17190
 Project: Hive
  Issue Type: Test
  Components: Metastore, Statistics
Affects Versions: 3.0.0
Reporter: Ashutosh Chauhan


Since the current bitvectors can't be intersected, there is no advantage in
storing them for unpartitioned tables.





Re: Review Request 61164: HIVE-17006 LLAP: Parquet caching

2017-07-27 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61164/
---

(Updated July 27, 2017, 10:27 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description (updated)
---

see jira


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java e8a3a7a49e
  itests/src/test/resources/testconfiguration.properties f66e19be3e
  llap-client/src/java/org/apache/hadoop/hive/llap/io/api/LlapIo.java 42129b7511
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java 0cbc8f6f4c
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 35b9d1f942
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java 9b8b76102a
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 21394c6aab
  ql/src/java/org/apache/hadoop/hive/ql/io/LlapCacheOnlyInputFormatInterface.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java f4fadbb61b
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java 322178a2f7
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetFooterInputFromCache.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 6a7a219dfe
  ql/src/test/queries/clientpositive/parquet_ppd_decimal.q dfca486241
  ql/src/test/queries/clientpositive/parquet_predicate_pushdown.q a38cdbe007
  ql/src/test/queries/clientpositive/parquet_types.q db37d2e1b2
  ql/src/test/queries/clientpositive/parquet_types_vectorization.q bb0e5b258f
  ql/src/test/queries/clientpositive/vectorized_parquet.q e6ebdaac62
  ql/src/test/queries/clientpositive/vectorized_parquet_types.q 7467cb3cf6
  ql/src/test/results/clientpositive/llap/parquet_types_vectorization.q.out PRE-CREATION
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java PRE-CREATION
  storage-api/src/java/org/apache/hadoop/hive/common/io/encoded/MemoryBufferOrBuffers.java PRE-CREATION


Diff: https://reviews.apache.org/r/61164/diff/2/

Changes: https://reviews.apache.org/r/61164/diff/1-2/


Testing
---


Thanks,

Sergey Shelukhin



Re: [DISCUSS] Separating out the metastore as its own TLP

2017-07-27 Thread Lefty Leverenz
Johndee (and everyone else), wiki edit privileges are easy to get:  About
This Wiki -- How to get permission to edit.

-- Lefty


On Thu, Jul 27, 2017 at 2:32 PM, Johndee Cloudera 
wrote:

> Well, if I cannot get Metastore, how about Hadoop Metastore? It is simple and
> self-explanatory to a degree.
>
> @Alan,
>
> Sorry to make the name suggestion here but I could not comment or edit the
> page you created.
>
> On Mon, Jul 24, 2017 at 7:17 PM, Gopal Vijayaraghavan 
> wrote:
>
> > Hi,
> >
> >
> > Changing the name isn't really optional or "being google-able" [2].
> >
> > The naming is a crucial part of trademark protection [1], which is the
> > only protection ASF has against hostile Embrace & Extends.
> >
> > Fragmented forks with the same name is particularly bad, especially if
> the
> > feature in question can be only used by a proprietary tool (like Dain's
> > suggestion about Presto view metadata, except it only works with a
> per-cpu
> > license).
> >
> > The safe path isn't pretty, it still ends up with IcedTea and IceWeasel …
> > but at least, those are clearly weird.
> >
> > Cheers,
> > Gopal
> >
> > [1] - https://en.wikipedia.org/wiki/A_moron_in_a_hurry#United_States
> > [2] - https://packages.debian.org/jessie/misc/metastore
> >
> > On 7/24/17, 3:04 PM, "Carl Steinbach"  wrote:
> >
> > +1 to Vihang's suggestion. Changing the name will only cause
> confusion.
> >
> > On Mon, Jul 24, 2017 at 2:28 PM, Johndee Cloudera <
> > john...@cloudera.com>
> > wrote:
> >
> > > +1 Vihang, I do not really like Catalog as it could create
> confusion
> > with
> > > the Catalog daemon from impala.
> > >
> > > On Mon, Jul 24, 2017 at 5:20 PM, Vihang Karajgaonkar <
> > vih...@cloudera.com>
> > > wrote:
> > >
> > > > Before we see a flood of name suggestions :) Why not just keep it
> > > > Metastore? Its already well-known in the community and easy to
> > relate to.
> > > >
> > > > On Mon, Jul 24, 2017 at 2:13 PM, Alan Gates <
> alanfga...@gmail.com>
> > > wrote:
> > > >
> > > > > In the same vein Carter and Gunther suggested Omegastore.  Pick
> > your
> > > > > alphabet and whether it’s a catalog or a store I guess.
> > > > >
> > > > > Alan.
> > > > >
> > > > > On Mon, Jul 24, 2017 at 1:35 PM, Sergey Shelukhin <
> > > > ser...@hortonworks.com>
> > > > > wrote:
> > > > >
> > > > > > I’d like to suggest ZCatalog.
> > > > > >
> > > > > > On 17/7/11, 15:41, "Lefty Leverenz"  >
> > wrote:
> > > > > >
> > > > > > >>> I'd like to suggest Riven.  (Owen O'Malley)
> > > > > > >
> > > > > > >> How about "Flora"?  (Andrew Sherman)
> > > > > > >
> > > > > > >Nice idea and thanks for introducing me to that book,
> Andrew.
> > > > > > >
> > > > > > >Along the same lines, how about "Honeycomb"?
> > > > > > >
> > > > > > >But since the idea is to make the metastore useful for many
> > > projects,
> > > > a
> > > > > > >generic name that starts with "Meta" would be less confusing
> > ...
> > > even
> > > > > > >though it breaks the tradition of Apache projects having
> > quirky
> > > names.
> > > > > > >Unfortunately "Metalog" is already in use.  "Metamorph" has
> > other
> > > > > > >connotations, but it's cool.
> > > > > > >
> > > > > > >Naming enthusiasm notwithstanding, I'm +/-0 on the idea of
> > splitting
> > > > off
> > > > > > >the metastore into a new project:  -0.5 for the sake of Hive
> > and
> > > +0.5
> > > > > for
> > > > > > >the greater good.  Wishy-washy, that's me.
> > > > > > >
> > > > > > >-- Lefty
> > > > > > >
> > > > > > >
> > > > > > >On Tue, Jul 11, 2017 at 1:04 PM, Andrew Sherman <
> > > > asher...@cloudera.com>
> > > > > > >wrote:
> > > > > > >
> > > > > > >> On Fri, Jun 30, 2017 at 5:05 PM, Owen O'Malley <
> > > > > owen.omal...@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > On Fri, Jun 30, 2017 at 3:26 PM, Chao Sun <
> > sunc...@apache.org>
> > > > > wrote:
> > > > > > >> >
> > > > > > >> > > and maybe a different project name?
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> > Yes, it certainly needs a new name. I'd like to suggest
> > Riven.
> > > > > > >> >
> > > > > > >> > .. Owen
> > > > > > >> >
> > > > > > >>
> > > > > > >> How about "Flora"?
> > > > > > >>
> > > > > > >> (Flora is the protagonist of The Bees by Laline Paull)
> > > > > > >>
> > > > > > >> -Andrew
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > - JRB
> > >
> >
> >
> >
> >
>
>
> --
> - JRB
>


[jira] [Created] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient

2017-07-27 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-17189:
--

 Summary: Fix backwards incompatibility in HiveMetaStoreClient
 Key: HIVE-17189
 URL: https://issues.apache.org/jira/browse/HIVE-17189
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.1.1
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and 
{{alter partition}} commands. However, it changes the signature of the @public 
interface of MetastoreClient and removes some methods, which breaks backwards 
compatibility. This can be fixed easily by re-introducing the removed methods 
and making them call into the newly added method 
{{alter_table_with_environment_context}}.
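
The delegation idea can be sketched as follows. This is only an illustration in
Python, not Hive's Java code: the class name, the {{calls}} list and the
argument list of {{alter_table_with_environment_context}} are assumptions made
for the example; only the method name itself comes from the description above.

```python
class MetaStoreClientSketch:
    """Hypothetical stand-in for the client; not the real Hive class."""

    def __init__(self):
        # Records every call that reaches the general entry point.
        self.calls = []

    def alter_table_with_environment_context(self, db, table, new_table, env_context):
        # Newer, more general method (argument list is assumed here).
        self.calls.append((db, table, new_table, env_context))

    def alter_table(self, db, table, new_table):
        # Re-introduced legacy method: keeps its old signature and simply
        # delegates to the new method with no environment context.
        self.alter_table_with_environment_context(db, table, new_table, None)


client = MetaStoreClientSketch()
client.alter_table("default", "t1", {"numRows": "10"})
print(client.calls)
```

Existing callers keep compiling against the old signature, while all the logic
lives in one place.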





Re: Review Request 61087: HIVE-16965 SMB join may produce incorrect results

2017-07-27 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61087/
---

(Updated July 27, 2017, 9:25 p.m.)


Review request for hive, Gopal V, Jason Dere, and Sergey Shelukhin.


Changes
---

Fixed the assert introduced in last rev. to compare the path values instead of 
comparing the path objects.


Bugs: HIVE-16965
https://issues.apache.org/jira/browse/HIVE-16965


Repository: hive-git


Description
---

Usually, in a JOIN with multiple inputs (partitions), the inputs are read 
sequentially. In the case of an SMB join, however, the inputs are read based on 
key ordering. This invalidates the current IOContext assumption that the input 
path, once set, won't change unless the input changes.
This was resulting in incorrect partition information in the results, as it is 
derived from the input path in IOContext.
The new logic changes the input path as and when the input changes.
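
A small sketch of why the path has to follow the record rather than the input.
It is written in Python for brevity and is not the Hive/Tez code; the function
names and the "p=1"/"p=2" paths are made up for illustration. Two inputs sorted
by key are merged in key order, so consecutive records can come from different
inputs and the "current path" must be updated each time the source changes.

```python
import heapq


def merge_by_key(inputs):
    """inputs: dict of path -> list of (key, value), each list sorted by key.

    Yields (path, key, value) in global key order, as a key-ordered merge would.
    """
    tagged = (
        [(key, path, value) for key, value in rows]
        for path, rows in inputs.items()
    )
    for key, path, value in heapq.merge(*tagged, key=lambda t: t[0]):
        yield path, key, value


def read_with_paths(inputs):
    """Attach the correct input path to every record and count path switches."""
    current_path = None
    switches = 0
    out = []
    for path, key, value in merge_by_key(inputs):
        if path != current_path:
            # Update the path whenever the record comes from a different input.
            current_path = path
            switches += 1
        out.append((key, value, current_path))
    return out, switches


rows, switches = read_with_paths({
    "p=1": [(1, "a"), (3, "c")],
    "p=2": [(2, "b"), (4, "d")],
})
print(rows, switches)
```

With sequential reads the path would change once per input; with the merge
above it changes on every record in this example.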


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordSource.java add7d08c40
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValueInputMerger.java 698fa7f69e
  ql/src/test/results/clientpositive/llap/llap_smb.q.out 87b33db805


Diff: https://reviews.apache.org/r/61087/diff/5/

Changes: https://reviews.apache.org/r/61087/diff/4-5/


Testing
---

Added a new test.


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-17188) ObjectStore runs out of memory for large batches of addPartitions().

2017-07-27 Thread Mithun Radhakrishnan (JIRA)
Mithun Radhakrishnan created HIVE-17188:
---

 Summary: ObjectStore runs out of memory for large batches of 
addPartitions().
 Key: HIVE-17188
 URL: https://issues.apache.org/jira/browse/HIVE-17188
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.2.0
Reporter: Mithun Radhakrishnan
Assignee: Chris Drome


For large batches (e.g. hundreds) of {{addPartitions()}}, the {{ObjectStore}} 
runs out of memory. Flushing the {{PersistenceManager}} alleviates the problem.

(Raising this on behalf of [~cdrome] and [~thiruvel].)
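
A rough sketch of the "flush periodically" idea, in Python rather than Hive's
Java. The {{FakePersistenceManager}} class, its fields and the {{flush_every}}
parameter are hypothetical and only model the buffering; they are not the JDO
or ObjectStore API.

```python
class FakePersistenceManager:
    """Toy model of a persistence manager that buffers objects until flushed."""

    def __init__(self):
        self.pending = []      # objects held in memory, not yet written
        self.flushed = 0       # number of objects written out so far
        self.max_pending = 0   # high-water mark of the in-memory buffer

    def make_persistent(self, obj):
        self.pending.append(obj)
        self.max_pending = max(self.max_pending, len(self.pending))

    def flush(self):
        self.flushed += len(self.pending)
        self.pending.clear()


def add_partitions(pm, partitions, flush_every=100):
    """Persist partitions, flushing every `flush_every` objects to bound memory."""
    for i, part in enumerate(partitions, start=1):
        pm.make_persistent(part)
        if i % flush_every == 0:
            pm.flush()
    pm.flush()  # write out whatever is left in the last partial batch


pm = FakePersistenceManager()
add_partitions(pm, range(250), flush_every=100)
print(pm.flushed, pm.max_pending)
```

In this model the buffer never holds more than {{flush_every}} objects,
whatever the batch size.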





Re: Review Request 61087: HIVE-16965 SMB join may produce incorrect results

2017-07-27 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61087/
---

(Updated July 27, 2017, 8:37 p.m.)


Review request for hive, Gopal V, Jason Dere, and Sergey Shelukhin.


Changes
---

Added a better assert to verify uniqueness of the paths for a given input.


Bugs: HIVE-16965
https://issues.apache.org/jira/browse/HIVE-16965


Repository: hive-git


Description
---

Usually, in a JOIN with multiple inputs (partitions), the inputs are read 
sequentially. In the case of an SMB join, however, the inputs are read based on 
key ordering. This invalidates the current IOContext assumption that the input 
path, once set, won't change unless the input changes.
This was resulting in incorrect partition information in the results, as it is 
derived from the input path in IOContext.
The new logic changes the input path as and when the input changes.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordSource.java add7d08c40
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValueInputMerger.java 698fa7f69e
  ql/src/test/results/clientpositive/llap/llap_smb.q.out 87b33db805


Diff: https://reviews.apache.org/r/61087/diff/4/

Changes: https://reviews.apache.org/r/61087/diff/3-4/


Testing
---

Added a new test.


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-17187) WebHCat SPNEGO support is incompleted

2017-07-27 Thread Eric Yang (JIRA)
Eric Yang created HIVE-17187:


 Summary: WebHCat SPNEGO support is incompleted
 Key: HIVE-17187
 URL: https://issues.apache.org/jira/browse/HIVE-17187
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 1.2.1
Reporter: Eric Yang


[Some online 
document|https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_security/content/spnego_setup_for_webhcat.html]
 describes how to set up WebHCat with SPNEGO support.  However, multiple 
services on the same host may use SPNEGO.  For example, the HBase REST API can 
also be set up to use the HTTP principal for SPNEGO support.  When the HTTP 
principal is shared among services, the Hadoop proxy user settings cannot 
identify whether a doAs call made with the HTTP principal was invoked by the 
HBase REST API or by WebHCat.  Ideally, WebHCat should keep track of its own 
service principal, independent of the SPNEGO principal, to ensure that the 
SPNEGO principal is only given authentication access.  The SPNEGO principal 
should not be used in the proxy user settings to grant authorization access.





Re: [DISCUSS] Separating out the metastore as its own TLP

2017-07-27 Thread Johndee Cloudera
Well, if I cannot get Metastore, how about Hadoop Metastore? It is simple and
self-explanatory to a degree.

@Alan,

Sorry to make the name suggestion here but I could not comment or edit the
page you created.

On Mon, Jul 24, 2017 at 7:17 PM, Gopal Vijayaraghavan 
wrote:

> Hi,
>
>
> Changing the name isn't really optional or "being google-able" [2].
>
> The naming is a crucial part of trademark protection [1], which is the
> only protection ASF has against hostile Embrace & Extends.
>
> Fragmented forks with the same name is particularly bad, especially if the
> feature in question can be only used by a proprietary tool (like Dain's
> suggestion about Presto view metadata, except it only works with a per-cpu
> license).
>
> The safe path isn't pretty, it still ends up with IcedTea and IceWeasel …
> but at least, those are clearly weird.
>
> Cheers,
> Gopal
>
> [1] - https://en.wikipedia.org/wiki/A_moron_in_a_hurry#United_States
> [2] - https://packages.debian.org/jessie/misc/metastore
>
> On 7/24/17, 3:04 PM, "Carl Steinbach"  wrote:
>
> +1 to Vihang's suggestion. Changing the name will only cause confusion.
>
> On Mon, Jul 24, 2017 at 2:28 PM, Johndee Cloudera <
> john...@cloudera.com>
> wrote:
>
> > +1 Vihang, I do not really like Catalog as it could create confusion
> with
> > the Catalog daemon from impala.
> >
> > On Mon, Jul 24, 2017 at 5:20 PM, Vihang Karajgaonkar <
> vih...@cloudera.com>
> > wrote:
> >
> > > Before we see a flood of name suggestions :) Why not just keep it
> > > Metastore? Its already well-known in the community and easy to
> relate to.
> > >
> > > On Mon, Jul 24, 2017 at 2:13 PM, Alan Gates 
> > wrote:
> > >
> > > > In the same vein Carter and Gunther suggested Omegastore.  Pick
> your
> > > > alphabet and whether it’s a catalog or a store I guess.
> > > >
> > > > Alan.
> > > >
> > > > On Mon, Jul 24, 2017 at 1:35 PM, Sergey Shelukhin <
> > > ser...@hortonworks.com>
> > > > wrote:
> > > >
> > > > > I’d like to suggest ZCatalog.
> > > > >
> > > > > On 17/7/11, 15:41, "Lefty Leverenz" 
> wrote:
> > > > >
> > > > > >>> I'd like to suggest Riven.  (Owen O'Malley)
> > > > > >
> > > > > >> How about "Flora"?  (Andrew Sherman)
> > > > > >
> > > > > >Nice idea and thanks for introducing me to that book, Andrew.
> > > > > >
> > > > > >Along the same lines, how about "Honeycomb"?
> > > > > >
> > > > > >But since the idea is to make the metastore useful for many
> > projects,
> > > a
> > > > > >generic name that starts with "Meta" would be less confusing
> ...
> > even
> > > > > >though it breaks the tradition of Apache projects having
> quirky
> > names.
> > > > > >Unfortunately "Metalog" is already in use.  "Metamorph" has
> other
> > > > > >connotations, but it's cool.
> > > > > >
> > > > > >Naming enthusiasm notwithstanding, I'm +/-0 on the idea of
> splitting
> > > off
> > > > > >the metastore into a new project:  -0.5 for the sake of Hive
> and
> > +0.5
> > > > for
> > > > > >the greater good.  Wishy-washy, that's me.
> > > > > >
> > > > > >-- Lefty
> > > > > >
> > > > > >
> > > > > >On Tue, Jul 11, 2017 at 1:04 PM, Andrew Sherman <
> > > asher...@cloudera.com>
> > > > > >wrote:
> > > > > >
> > > > > >> On Fri, Jun 30, 2017 at 5:05 PM, Owen O'Malley <
> > > > owen.omal...@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > On Fri, Jun 30, 2017 at 3:26 PM, Chao Sun <
> sunc...@apache.org>
> > > > wrote:
> > > > > >> >
> > > > > >> > > and maybe a different project name?
> > > > > >> > >
> > > > > >> >
> > > > > >> > Yes, it certainly needs a new name. I'd like to suggest
> Riven.
> > > > > >> >
> > > > > >> > .. Owen
> > > > > >> >
> > > > > >>
> > > > > >> How about "Flora"?
> > > > > >>
> > > > > >> (Flora is the protagonist of The Bees by Laline Paull)
> > > > > >>
> > > > > >> -Andrew
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > - JRB
> >
>
>
>
>


-- 
- JRB


[GitHub] hive pull request #211: HIVE-17167 Create metastore specific configuration t...

2017-07-27 Thread alanfgates
GitHub user alanfgates opened a pull request:

https://github.com/apache/hive/pull/211

HIVE-17167 Create metastore specific configuration tool



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alanfgates/hive hive17167

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #211






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Review Request 61188: HIVE-16614

2017-07-27 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61188/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-16614
https://issues.apache.org/jira/browse/HIVE-16614


Repository: hive-git


Description
---

HIVE-16614


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/type/TimestampTZ.java ed83871a4a049a3e1a8417fb68bd1a1e66026a4e
  common/src/java/org/apache/hadoop/hive/common/type/TimestampTZUtil.java PRE-CREATION
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 05f6cc95927c130100df01708a3fd186e1cfd116
  common/src/test/org/apache/hadoop/hive/common/type/TestTimestampTZ.java 0cef77a9cee263b98495c4d139d978364d9320b0
  jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java 6742423ff509c8098ad821540ff778a0db2dd6a7
  ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 8902f6c2db72d9804f34b9006d867d85d54ee916
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 517ce312a7783d32b1a9fe91f22ac2293279c112
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveType.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ExprNodeConverter.java f974cc9195772e09b5d09b4da6adf2919ceb529f
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java 7665f56cfccf6c92e0fb7d03e475a4db6822f9e0
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/SqlFunctionConverter.java c6b34d46c44591a7d6e06598426414f13589359f
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/TypeConverter.java 2df7588cba01399ae980059a7b2f447a9973f2e6
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 5b7fc25417e0a21833b47f4fb399c1a780642a13
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 632b9c62cfeb34e9e1cad17ad5f522fe06b882b0
  ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java 1458211b81514963975269ea0dcd33276d851c28
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToTimestampTZ.java e96012bbf2babc3b9f200a61862540fbf5ded999
  ql/src/test/queries/clientpositive/timestamptz_2.q a335f529d7bff9e2495ff10fd38d30e2d77bb909
  ql/src/test/queries/clientpositive/timezone.q PRE-CREATION
  ql/src/test/results/clientpositive/annotate_stats_select.q.out 67d134ba4a9cd8db12513d7de8bb9d03b52a4f3d
  ql/src/test/results/clientpositive/constantfolding.q.out 10e185f00b8a8f848122e604906a72dd54ee9d3d
  ql/src/test/results/clientpositive/timestamptz.q.out 626fe92286651560a7fb0d6dca98584d49fb671f
  ql/src/test/results/clientpositive/timestamptz_1.q.out 75bbfac3e816e3979a72726b696e88e98d7d1e73
  ql/src/test/results/clientpositive/timestamptz_2.q.out 2666735fbc4660723598c6ef70e5913185118328
  ql/src/test/results/clientpositive/timezone.q.out PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java f333ae9938b5f17925c1e37024e7d3b85037a990
  serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampTZWritable.java 8c3f8f647152c4447ef0ee5255d6b8b4727406d0
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySerDeParameters.java ee4bb345cee4d0bf6f7952c5e8549ee51059a0ec
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyTimestampTZ.java df5c586f56274f722f8779afb4bb6262b61d5f02
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java 6d1ee1e97bb02a53774d272a22b493fa65136800
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyTimestampTZObjectInspector.java 7336385a7aea4187d729694437fb32b6f88229a5
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryTimestampTZ.java 6d9ca6e93781f9aafca91be203bed7c4c078e76a
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java ca96e33600bedac8363b3a78e36898a8297c410b
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaTimestampTZObjectInspector.java 32b9c69909488dc4e32cf4ba84d7b97881c4d926
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java d4b7a32bcaebb4b8402286b970a3fe388f31474a
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorFactory.java 2425c30f012fd536245f6e333891decd0abf98c6
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 886c29885241ecf55532e030120e23f9fc19145c
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantTimestampTZObjectInspector.java 5805ce8b5e1512d43463017a4fba5be3fa496820
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableTimestampTZObjectInspector.java
 

[jira] [Created] (HIVE-17186) `double` type constant operation loses precision

2017-07-27 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created HIVE-17186:


 Summary: `double` type constant operation loses precision
 Key: HIVE-17186
 URL: https://issues.apache.org/jira/browse/HIVE-17186
 Project: Hive
  Issue Type: Bug
Reporter: Dongjoon Hyun


This might be an issue where Hive loses precision and generates a wrong 
result when handling *double* constant operations. This was reported in the 
following environment.

*ENVIRONMENT*
https://github.com/hortonworks/hive-testbench/blob/hive14/tpch-gen/ddl/orc.sql

*SQL*
{code}
hive> explain select l_discount from lineitem where l_discount between 0.06 - 0.01 and 0.06 + 0.01 limit 10;
OK
Plan not optimized by CBO.

Stage-0
   Fetch Operator
  limit:10
  Stage-1
 Map 1 vectorized
 File Output Operator [FS_9]
compressed:false
Statistics:Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
Limit [LIM_8]
   Number of rows:10
   Statistics:Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
   Select Operator [OP_7]
  outputColumnNames:["_col0"]
  Statistics:Num rows: 294854 Data size: 2358832 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator [FIL_6]
 predicate:l_discount BETWEEN 0.049999999999999996 AND 0.06999999999999999 (type: boolean)
 Statistics:Num rows: 294854 Data size: 2358832 Basic stats: COMPLETE Column stats: COMPLETE
 TableScan [TS_0]
alias:lineitem
Statistics:Num rows: 589709 Data size: 4832986297043 Basic stats: COMPLETE Column stats: COMPLETE

hive> select max(l_discount) from lineitem where l_discount between 0.06 - 0.01 and 0.06 + 0.01 limit 10;
OK
0.06
Time taken: 314.923 seconds, Fetched: 1 row(s)
{code}

Hive excludes 0.07, contrary to users' intuition. This difference also 
confuses some users, because they believe that Hive's result is the correct 
one. Is there any way for Hive to fix this?
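
For reference, the effect is ordinary IEEE 754 double arithmetic and can be
reproduced outside Hive. The following is a minimal sketch in Python (whose
floats are doubles), not Hive code; the helper name {{between}} is made up for
the example. It folds the same constants as the query and compares the result
with exact decimal arithmetic.

```python
from decimal import Decimal


def between(x, lo, hi):
    """SQL-style BETWEEN: inclusive on both ends."""
    return lo <= x <= hi


# Constant folding with binary doubles, as in the query's predicate.
lower = 0.06 - 0.01
upper = 0.06 + 0.01
print(lower, upper)                  # 0.049999999999999996 0.06999999999999999
print(between(0.07, lower, upper))   # False: 0.07 is just above the folded bound

# The same bounds folded with exact decimal arithmetic.
dec_lower = Decimal("0.06") - Decimal("0.01")
dec_upper = Decimal("0.06") + Decimal("0.01")
print(dec_lower, dec_upper)          # 0.05 0.07
print(between(Decimal("0.07"), dec_lower, dec_upper))  # True
```

So a double column holding 0.07 falls outside the folded upper bound, while
decimal literals would keep 0.07 inside the range.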





[jira] [Created] (HIVE-17185) TestHiveMetaStoreStatsMerge.testStatsMerge is failing

2017-07-27 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-17185:
---

 Summary: TestHiveMetaStoreStatsMerge.testStatsMerge is failing
 Key: HIVE-17185
 URL: https://issues.apache.org/jira/browse/HIVE-17185
 Project: Hive
  Issue Type: Test
  Components: Metastore, Test
Affects Versions: 3.0.0
Reporter: Ashutosh Chauhan


Likely because of HIVE-16997





[jira] [Created] (HIVE-17184) Unexpected new line in beeline when running with -f option

2017-07-27 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-17184:
--

 Summary: Unexpected new line in beeline when running with -f option
 Key: HIVE-17184
 URL: https://issues.apache.org/jira/browse/HIVE-17184
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar
Priority: Minor


When running in -f mode on BeeLine I see an extra new line getting added at the 
end of the results.

{noformat}
vihang-MBP:bin vihang$ beeline -f /tmp/query.sql 2>/dev/null
+----------+-----------+
| test.id  | test.val  |
+----------+-----------+
| 1        | one       |
| 2        | two       |
| 1        | three     |
+----------+-----------+

vihang-MBP:bin vihang$ beeline -e "select * from test;" 2>/dev/null
+----------+-----------+
| test.id  | test.val  |
+----------+-----------+
| 1        | one       |
| 2        | two       |
| 1        | three     |
+----------+-----------+
vihang-MBP:bin vihang$
{noformat}





Re: [DISCUSS] Separating out the metastore as its own TLP

2017-07-27 Thread Alan Gates
I think the concerns with Metastore are twofold.  One, it’s a common term
in software, and thus would be hard to defend as a trademark.  Hence
Gopal’s link to the Debian package already named metastore.  IANAL but as I
understand trademark law the test is whether a name could cause a
reasonable person to be confused as to who is offering a good or service.
So a McDonalds tire store is ok, a reasonable person knows McDonalds sells
food, not tires, but a restaurant named McDonnies that serves hamburgers
isn’t.  Whether IceWeasel (or any of the suggested names besides metastore)
passes that test I don’t know.  The board will require us to go through a
name search as part of the TLP process.

If I understand Gopal’s second point it is that it will cause confusion for
users as to whether this is still part of Hive or something separate.  I
think stressing the continuity is exactly what Vihang and Carl like about
keeping the name.

My suggestion would be that the project needs a name other than metastore,
but we can call the module metastore.  Having a more unique name is good
for trademarks and helping users find your stuff via google, etc.  (Go to
Google and search on “Hive” to see what I mean here.)  If we pick X as the
project name, we can then refer to it as the X metastore, the maven modules
can be x-metastore, etc.  I think this strikes the balance between the
benefits of a unique-ish name and telling users what it does and where it
came from.

Plus, IceWeasel is a _way_ cooler name for a project than metastore. :)

Alan.

On Tue, Jul 25, 2017 at 4:45 PM, Carl Steinbach  wrote:

> "IceWeasel" and "MetaStore" are both examples of English compound words.
> What exactly makes the former any safer than the latter?
>
> On Mon, Jul 24, 2017 at 4:17 PM, Gopal Vijayaraghavan 
> wrote:
>
> > Hi,
> >
> >
> > Changing the name isn't really optional or "being google-able" [2].
> >
> > The naming is a crucial part of trademark protection [1], which is the
> > only protection ASF has against hostile Embrace & Extends.
> >
> > Fragmented forks with the same name is particularly bad, especially if
> the
> > feature in question can be only used by a proprietary tool (like Dain's
> > suggestion about Presto view metadata, except it only works with a
> per-cpu
> > license).
> >
> > The safe path isn't pretty, it still ends up with IcedTea and IceWeasel …
> > but at least, those are clearly weird.
> >
> > Cheers,
> > Gopal
> >
> > [1] - https://en.wikipedia.org/wiki/A_moron_in_a_hurry#United_States
> > [2] - https://packages.debian.org/jessie/misc/metastore
> >
> > On 7/24/17, 3:04 PM, "Carl Steinbach"  wrote:
> >
> > +1 to Vihang's suggestion. Changing the name will only cause
> confusion.
> >
> > On Mon, Jul 24, 2017 at 2:28 PM, Johndee Cloudera <
> > john...@cloudera.com>
> > wrote:
> >
> > > +1 Vihang, I do not really like Catalog as it could create
> confusion
> > with
> > > the Catalog daemon from impala.
> > >
> > > On Mon, Jul 24, 2017 at 5:20 PM, Vihang Karajgaonkar <
> > vih...@cloudera.com>
> > > wrote:
> > >
> > > > Before we see a flood of name suggestions :) Why not just keep it
> > > > Metastore? Its already well-known in the community and easy to
> > relate to.
> > > >
> > > > On Mon, Jul 24, 2017 at 2:13 PM, Alan Gates <
> alanfga...@gmail.com>
> > > wrote:
> > > >
> > > > > In the same vein Carter and Gunther suggested Omegastore.  Pick
> > your
> > > > > alphabet and whether it’s a catalog or a store I guess.
> > > > >
> > > > > Alan.
> > > > >
> > > > > On Mon, Jul 24, 2017 at 1:35 PM, Sergey Shelukhin <
> > > > ser...@hortonworks.com>
> > > > > wrote:
> > > > >
> > > > > > I’d like to suggest ZCatalog.
> > > > > >
> > > > > > On 17/7/11, 15:41, "Lefty Leverenz"  >
> > wrote:
> > > > > >
> > > > > > >>> I'd like to suggest Riven.  (Owen O'Malley)
> > > > > > >
> > > > > > >> How about "Flora"?  (Andrew Sherman)
> > > > > > >
> > > > > > >Nice idea and thanks for introducing me to that book,
> Andrew.
> > > > > > >
> > > > > > >Along the same lines, how about "Honeycomb"?
> > > > > > >
> > > > > > >But since the idea is to make the metastore useful for many
> > > projects,
> > > > a
> > > > > > >generic name that starts with "Meta" would be less confusing
> > ...
> > > even
> > > > > > >though it breaks the tradition of Apache projects having
> > quirky
> > > names.
> > > > > > >Unfortunately "Metalog" is already in use.  "Metamorph" has
> > other
> > > > > > >connotations, but it's cool.
> > > > > > >
> > > > > > >Naming enthusiasm notwithstanding, I'm +/-0 on the idea of
> > splitting
> > > > off
> > > > > > >the metastore into a new project:  -0.5 for the sake of Hive
> > and
> > > +0.5
> > > > > for
> > > > > > 

[jira] [Created] (HIVE-17183) Disable rename operations during bootstrap dump

2017-07-27 Thread Sankar Hariappan (JIRA)
Sankar Hariappan created HIVE-17183:
---

 Summary: Disable rename operations during bootstrap dump
 Key: HIVE-17183
 URL: https://issues.apache.org/jira/browse/HIVE-17183
 Project: Hive
  Issue Type: Sub-task
  Components: repl
Affects Versions: 2.1.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan
 Fix For: 3.0.0


Currently, a bootstrap dump can lead to data loss if any rename happens while the 
dump is in progress. Rename support can be added in the next phase of development, 
as it needs a proper design to keep track of renamed tables/partitions. 
So, for the time being, we shall disable rename operations while a bootstrap dump 
is in progress to avoid any inconsistent state.





[jira] [Created] (HIVE-17182) Invalid statistics like "RAW DATA SIZE" info for parquet file

2017-07-27 Thread liyunzhang_intel (JIRA)
liyunzhang_intel created HIVE-17182:
---

 Summary: Invalid statistics like "RAW DATA SIZE" info for parquet 
file
 Key: HIVE-17182
 URL: https://issues.apache.org/jira/browse/HIVE-17182
 Project: Hive
  Issue Type: Bug
Reporter: liyunzhang_intel


On the TPC-DS store_sales table at 200g scale,
use "describe formatted store_sales" to view the statistics:
{code}
hive> describe formatted store_sales;
OK
# col_name  data_type   comment 
 
ss_sold_time_sk bigint  
ss_item_sk  bigint  
ss_customer_sk  bigint  
ss_cdemo_sk bigint  
ss_hdemo_sk bigint  
ss_addr_sk  bigint  
ss_store_sk bigint  
ss_promo_sk bigint  
ss_ticket_number    bigint  
ss_quantity int 
ss_wholesale_cost   double  
ss_list_price   double  
ss_sales_price  double  
ss_ext_discount_amt double  
ss_ext_sales_price  double  
ss_ext_wholesale_cost   double  
ss_ext_list_price   double  
ss_ext_tax  double  
ss_coupon_amt   double  
ss_net_paid double  
ss_net_paid_inc_tax double  
ss_net_profit   double  
 
# Partition Information  
# col_name  data_type   comment 
 
ss_sold_date_sk bigint  
 
# Detailed Table Information 
Database:   tpcds_bin_partitioned_parquet_200
Owner:  root 
CreateTime: Tue Jun 06 11:51:48 CST 2017 
LastAccessTime: UNKNOWN  
Retention:  0
Location:   hdfs://bdpe38:9000/user/hive/warehouse/tpcds_bin_partitioned_parquet_200.db/store_sales
  
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles    2023
numPartitions   1824
numRows 575995635   
rawDataSize 12671903970 
totalSize   46465926745 
transient_lastDdlTime   1496721108  
{code}
The rawDataSize is nearly 12G while the totalSize is nearly 46G.
View the original data on HDFS:
{format}
#hadoop fs -du -h /tmp/tpcds-generate/200/
75.8 G   /tmp/tpcds-generate/200/store_sales
{format} 
View the Parquet files on HDFS:
{format}
# hadoop fs -du -h /user/hive/warehouse/tpcds_bin_partitioned_parquet_200.db
43.3 G   /user/hive/warehouse/tpcds_bin_partitioned_parquet_200.db/store_sales
{format}

The raw data size appears to be nearly 75G, but the "describe formatted 
store_sales" command reports a rawDataSize of only 12G.
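
For a quick sanity check of the figures above, here is a minimal sketch that converts the reported byte counts into the binary units printed by "hadoop fs -du -h" and divides rawDataSize by numRows. The constants are copied from the table parameters in this report; the variable and function names are only illustrative. The observation that the per-row value equals the number of non-partition columns is plain arithmetic on these numbers. Whether the Parquet stats path actually derives rawDataSize that way is an assumption that this report does not confirm.

```python
# Values copied from the "Table Parameters" section of the describe output.
RAW_DATA_SIZE = 12671903970   # rawDataSize, in bytes
TOTAL_SIZE = 46465926745      # totalSize, in bytes
NUM_ROWS = 575995635          # numRows
NUM_DATA_COLUMNS = 22         # non-partition columns listed in the describe output

GIB = 1024 ** 3  # "hadoop fs -du -h" reports sizes with binary (1024-based) prefixes


def to_gib(num_bytes):
    """Convert a byte count to binary gigabytes."""
    return num_bytes / GIB


raw_gib = to_gib(RAW_DATA_SIZE)
total_gib = to_gib(TOTAL_SIZE)
bytes_per_row = RAW_DATA_SIZE / NUM_ROWS

print(f"rawDataSize: {raw_gib:.1f} G")
print(f"totalSize:   {total_gib:.1f} G")
print(f"rawDataSize / numRows: {bytes_per_row:.1f}")
print(f"equals column count: {RAW_DATA_SIZE == NUM_ROWS * NUM_DATA_COLUMNS}")
```

With these inputs, totalSize works out to the same 43.3 G that "hadoop fs -du -h" shows for the Parquet directory, so totalSize looks consistent with the files on HDFS and only rawDataSize looks suspect.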



