[jira] [Created] (HIVE-8625) Some union queries result in plans with many unions with CBO on

2014-10-27 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-8625:
-

 Summary: Some union queries result in plans with many unions with 
CBO on
 Key: HIVE-8625
 URL: https://issues.apache.org/jira/browse/HIVE-8625
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesus Camacho Rodriguez
Priority: Minor


For some queries, e.g., union9 and union16, we get plans with many binary 
unions when CBO is on. The reason is that the CBO introduces an identity 
select operator after each union; thus, the method that merges unions while 
the AST is translated into a logical plan is not executed, as it does not 
detect that there is another union operator. 

With the fix in the attached patch, the method will merge two unions if (1) 
one is the child of the other, or (2) one is a descendant of the other and 
there is an identity select operator between them.
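
As a rough illustration of the two merge conditions, here is a minimal sketch on a simplified operator tree. It does not use Hive's operator classes: {{Node}}, {{Kind}}, {{skipIdentitySelects}} and {{flattenedInputs}} are hypothetical names, and edges are modelled as a list of inputs per operator. A union is inlined into its consumer when it is reached directly or only through identity selects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified operator tree; these are not Hive's Operator classes.
class UnionMergeSketch {
    enum Kind { UNION, IDENTITY_SELECT, OTHER }

    static final class Node {
        final Kind kind;
        final String name;
        final List<Node> inputs;

        Node(Kind kind, String name, Node... inputs) {
            this.kind = kind;
            this.name = name;
            this.inputs = new ArrayList<>(Arrays.asList(inputs));
        }
    }

    // Walk down through single-input identity selects, which do not change the rows.
    static Node skipIdentitySelects(Node node) {
        while (node.kind == Kind.IDENTITY_SELECT && node.inputs.size() == 1) {
            node = node.inputs.get(0);
        }
        return node;
    }

    // Collect the inputs of a union, inlining any union reached directly (case 1)
    // or only through identity selects (case 2).
    static List<Node> flattenedInputs(Node union) {
        List<Node> result = new ArrayList<>();
        for (Node input : union.inputs) {
            Node candidate = skipIdentitySelects(input);
            if (candidate.kind == Kind.UNION) {
                result.addAll(flattenedInputs(candidate));
            } else {
                result.add(input);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Node a = new Node(Kind.OTHER, "a");
        Node b = new Node(Kind.OTHER, "b");
        Node c = new Node(Kind.OTHER, "c");
        Node inner = new Node(Kind.UNION, "u1", a, b);
        Node select = new Node(Kind.IDENTITY_SELECT, "sel", inner);
        Node outer = new Node(Kind.UNION, "u2", select, c);
        // The two binary unions collapse into one union over a, b and c.
        for (Node n : flattenedInputs(outer)) {
            System.out.println(n.name);
        }
    }
}
```

A select that is not an identity projection is left in place, so only the row-preserving case is merged.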



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8625) Some union queries result in plans with many unions with CBO on

2014-10-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-8625:
--
Attachment: HIVE-8625.patch



[jira] [Updated] (HIVE-8625) Some union queries result in plans with many unions with CBO on

2014-10-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-8625:
--
Status: Patch Available  (was: Open)



[jira] [Created] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19771:
--

 Summary: allowNullColumnForMissingStats should not be false when 
column stats are estimated
 Key: HIVE-19771
 URL: https://issues.apache.org/jira/browse/HIVE-19771
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 3.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Otherwise we may throw an Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19773:
--

 Summary: CBO exception while running queries with tables that are 
not present in materialized views
 Key: HIVE-19773
 URL: https://issues.apache.org/jira/browse/HIVE-19773
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Affects Versions: 3.1.0, 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


When we obtain the valid list of write ids, some tables in the materialized 
views may not be present in the list because they are not present in the query, 
which leads to exceptions (hidden in logs) when we try to load the materialized 
views in the planner, as we need to verify whether they are outdated or not.





[jira] [Created] (HIVE-19859) Inspect lock components for DBHiveLock while verifying whether transaction list is valid

2018-06-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19859:
--

 Summary: Inspect lock components for DBHiveLock while verifying 
whether transaction list is valid
 Key: HIVE-19859
 URL: https://issues.apache.org/jira/browse/HIVE-19859
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-19876) Driver.isValidTxnListState should rely on global txn list

2018-06-12 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19876:
--

 Summary: Driver.isValidTxnListState should rely on global txn list
 Key: HIVE-19876
 URL: https://issues.apache.org/jira/browse/HIVE-19876
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Eugene Koifman
Assignee: Jesus Camacho Rodriguez


When it calls {{ValidTxnList currentTxnList = queryTxnMgr.getValidTxns();}}, it 
will not see anything above its current txnid. It should rely on global valid 
txn list instead.





[jira] [Created] (HIVE-19884) Invalidation cache may throw NPE when there is no data in table used by materialized view

2018-06-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19884:
--

 Summary: Invalidation cache may throw NPE when there is no data in 
table used by materialized view
 Key: HIVE-19884
 URL: https://issues.apache.org/jira/browse/HIVE-19884
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-19893) TxnIdUtils.checkEquivalentWriteIds may return false negatives

2018-06-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19893:
--

 Summary: TxnIdUtils.checkEquivalentWriteIds may return false 
negatives
 Key: HIVE-19893
 URL: https://issues.apache.org/jira/browse/HIVE-19893
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: Jesus Camacho Rodriguez


Observed while working on HIVE-19876.

The following two lists are equivalent:
{noformat}
2:2:2: (hwm:2, minOpenId:2, openTxns:2)
2:9223372036854775807:: (hwm:2, minOpenId:none)
{noformat}

However, {{checkEquivalentWriteIds}} will return false, i.e., not equivalent.





[jira] [Created] (HIVE-19907) Driver.isValidTxnListState should rely on global txn list

2018-06-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19907:
--

 Summary: Driver.isValidTxnListState should rely on global txn list
 Key: HIVE-19907
 URL: https://issues.apache.org/jira/browse/HIVE-19907
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.0, 4.0.0
Reporter: Eugene Koifman
Assignee: Jesus Camacho Rodriguez


When it calls {{ValidTxnList currentTxnList = queryTxnMgr.getValidTxns();}}, it 
will not see anything above its current txnid. It should rely on global valid 
txn list instead.





[jira] [Created] (HIVE-19949) Clean up logic to check locks in Driver.isValidTxnListState

2018-06-20 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19949:
--

 Summary: Clean up logic to check locks in 
Driver.isValidTxnListState
 Key: HIVE-19949
 URL: https://issues.apache.org/jira/browse/HIVE-19949
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.0, 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Follow-up for HIVE-19876.





[jira] [Created] (HIVE-19973) Enable materialized view rewriting by default

2018-06-22 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19973:
--

 Summary: Enable materialized view rewriting by default
 Key: HIVE-19973
 URL: https://issues.apache.org/jira/browse/HIVE-19973
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Change property value for {{hive.materializedview.rewriting}} to {{true}}. For 
tests, it is already {{true}} by default.





[jira] [Created] (HIVE-19974) Show tables statement includes views and materialized views

2018-06-22 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19974:
--

 Summary: Show tables statement includes views and materialized 
views
 Key: HIVE-19974
 URL: https://issues.apache.org/jira/browse/HIVE-19974
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


It would probably be more logical to show only the tables, since 'show views' 
and 'show materialized views' statements already exist.





[jira] [Created] (HIVE-19997) Batches for TestMiniDruidCliDriver

2018-06-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19997:
--

 Summary: Batches for TestMiniDruidCliDriver
 Key: HIVE-19997
 URL: https://issues.apache.org/jira/browse/HIVE-19997
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


I have observed that {{TestMiniDruidCliDriver}} takes a long time to execute. I 
verified that execution is not batched. We could batch tests as we do with 
{{TestHBaseCliDriver}}, i.e., 5 q files per batch.
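
To make the "5 q files per batch" idea concrete, here is a minimal sketch of splitting a list of q files into fixed-size batches. It is not how the Hive test infrastructure configures batching; {{QFileBatchSketch}}, {{batches}} and the file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the real test drivers are configured elsewhere.
class QFileBatchSketch {
    // Split items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> qFiles = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            qFiles.add("example_test_" + i + ".q"); // made-up file names
        }
        // With 12 files and 5 per batch, the last batch holds the remaining 2.
        for (List<String> batch : batches(qFiles, 5)) {
            System.out.println(batch);
        }
    }
}
```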





[jira] [Created] (HIVE-20006) materializations invalidation cache work with multiple active remote metastores

2018-06-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20006:
--

 Summary: materializations invalidation cache work with multiple 
active remote metastores
 Key: HIVE-20006
 URL: https://issues.apache.org/jira/browse/HIVE-20006
 Project: Hive
  Issue Type: Improvement
  Components: Materialized views
Affects Versions: 3.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
HIVE-19027.03.patch, HIVE-19027.04.patch

The main points:
 - Only MVs stored in transactional tables can have a time window value of 0. 
Those are the only MVs that can be guaranteed not to be outdated when a query 
is executed; if we use custom storage handlers to store the materialized view, 
we cannot make any promises.
 - For MVs that +cannot be outdated+, we do not check the metastore. Instead, 
comparison is based on valid write id lists.
 - For MVs that +can be outdated+, we still rely on the invalidation cache.
 ** The window for valid outdated MVs can be specified in intervals of 1 minute 
(less than that, it is difficult to have any guarantees about whether the MV is 
actually outdated by less than a minute or not).
 ** The async loading is done every interval / 2 (or probably better, we can 
make it configurable).
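
The points above can be read as a small decision procedure. The sketch below is only one interpretation, not the patch itself: it assumes that "cannot be outdated" corresponds to a time window of 0, and every name in it ({{MvFreshnessSketch}}, {{Check}}, {{chooseCheck}}, {{refreshPeriod}}) is hypothetical.

```java
import java.time.Duration;

// Hypothetical sketch of the rules listed above; not Hive's implementation.
class MvFreshnessSketch {
    enum Check { COMPARE_WRITE_ID_LISTS, INVALIDATION_CACHE }

    // Decide how freshness is verified for a materialized view.
    // windowMinutes is the allowed staleness window in whole minutes.
    static Check chooseCheck(boolean storedInTransactionalTable, long windowMinutes) {
        if (windowMinutes < 0) {
            throw new IllegalArgumentException("window must not be negative");
        }
        if (windowMinutes == 0) {
            // Assumption: a window of 0 means the MV must never be outdated,
            // which is only allowed for transactional storage.
            if (!storedInTransactionalTable) {
                throw new IllegalArgumentException(
                    "a window of 0 requires a transactional table");
            }
            return Check.COMPARE_WRITE_ID_LISTS;
        }
        return Check.INVALIDATION_CACHE;
    }

    // Async loading period: half of the staleness window.
    static Duration refreshPeriod(long windowMinutes) {
        return Duration.ofMinutes(windowMinutes).dividedBy(2);
    }

    public static void main(String[] args) {
        System.out.println(chooseCheck(true, 0));
        System.out.println(chooseCheck(false, 10));
        System.out.println(refreshPeriod(10));
    }
}
```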





[jira] [Created] (HIVE-20007) Hive should carry out timestamp computations in UTC

2018-06-27 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20007:
--

 Summary: Hive should carry out timestamp computations in UTC
 Key: HIVE-20007
 URL: https://issues.apache.org/jira/browse/HIVE-20007
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Ryan Blue
Assignee: Jesus Camacho Rodriguez
 Fix For: 3.1.0


Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
{{Timestamp#getYear()}} and similar methods to implement SQL functions like 
{{year}}.

When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
that alternates between PST and PDT, there are times that cannot be represented 
because the effective zone skips them.

{code}
hive> select TIMESTAMP '2015-03-08 02:10:00.101';
2015-03-08 03:10:00.101
{code}

Using UTC instead of the SQL session time zone as the underlying zone for a 
java.sql.Timestamp avoids this bug, while still returning correct values for 
{{getYear}} etc. Using UTC as the convenience representation (timestamp without 
time zone has no real zone) would make timestamp calculations more consistent 
and avoid similar problems in the future.
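
The gap can be reproduced outside Hive. The sketch below uses {{java.time}} rather than {{java.sql.Timestamp}}, so it only illustrates the zone arithmetic and is not Hive's code path; the class and method names are made up. Interpreting the wall-clock value in America/Los_Angeles moves it out of the skipped hour, while interpreting it in UTC leaves it unchanged.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Illustration only: java.time stands in for java.sql.Timestamp here.
class DstGapSketch {
    // Interpret a wall-clock value in the given zone and read it back.
    // For a value inside a DST gap, java.time shifts it later by the gap length.
    static LocalDateTime roundTrip(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.parse("2015-03-08T02:10:00.101");
        // 02:10 does not exist on this date in Los Angeles.
        System.out.println(roundTrip(ts, ZoneId.of("America/Los_Angeles")));
        // UTC has no DST transitions, so the value survives as written.
        System.out.println(roundTrip(ts, ZoneOffset.UTC));
    }
}
```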

Notably, this would break the {{unix_timestamp}} UDF that specifies the result 
is with respect to ["the default timezone and default 
locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
 That function would need to be updated to use the 
{{System.getProperty("user.timezone")}} zone.





[jira] [Created] (HIVE-20073) Additional tests for to_utc_timestamp function based on HIVE-20068

2018-07-03 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20073:
--

 Summary: Additional tests for to_utc_timestamp function based on 
HIVE-20068
 Key: HIVE-20073
 URL: https://issues.apache.org/jira/browse/HIVE-20073
 Project: Hive
  Issue Type: Bug
 Environment: MapR running on Linux I believe.  Client is DBeaver on 
Windows 7.
Reporter: JAMES J STEINBUGL
 Attachments: image-2018-07-03-08-50-42-390.png

I have the following script and I'm at a loss to explain the behavior.  Possibly 
it's an older bug, as we are using the 2.1.1 drivers (?).  We noticed this issue 
when converting from US/Eastern into UTC and then back to US/Eastern.  
Everything that was in Status Date / Status Hour on 3/11/17 21:00:00 shifted 6 
hours ahead into UTC ... then shifted back to 3/11/17 22:00:00 in 
US/Eastern.  The behavior appears to be the same using the constant EST5EDT.  
EDT became effective on 3/12 at 2 am, so the issue appears only at this boundary 
condition when we "spring ahead", but at least on the surface it seems 
incorrect.

--
-- Potential Issue with to_utc_timestamp
---

SELECT '2017-03-11 18:00:00', to_utc_timestamp(timestamp '2017-03-11 18:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-11 19:00:00', to_utc_timestamp(timestamp '2017-03-11 19:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-11 20:00:00', to_utc_timestamp(timestamp '2017-03-11 20:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

{color:#FF}SELECT '2017-03-11 21:00:00', to_utc_timestamp(timestamp '2017-03-11 21:00:00','US/Eastern'); -- Shifts ahead 6 hours (???){color}

{color:#FF}_c0                                   _c1
2017-03-11 21:00:00       2017-03-12 03:00:00{color}

SELECT '2017-03-11 22:00:00', to_utc_timestamp(timestamp '2017-03-11 22:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-11 23:00:00', to_utc_timestamp(timestamp '2017-03-11 23:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-12 00:00:00', to_utc_timestamp(timestamp '2017-03-12 00:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-12 01:00:00', to_utc_timestamp(timestamp '2017-03-12 01:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-12 02:00:00', to_utc_timestamp(timestamp '2017-03-12 02:00:00','US/Eastern'); -- Shifts ahead 5 hours as expected

SELECT '2017-03-12 03:00:00', to_utc_timestamp(timestamp '2017-03-12 03:00:00','US/Eastern'); -- Shifts ahead 4 hours as expected

SELECT '2017-03-12 04:00:00', to_utc_timestamp(timestamp '2017-03-12 04:00:00','US/Eastern'); -- Shifts ahead 4 hours as expected

SELECT '2017-03-12 05:00:00', to_utc_timestamp(timestamp '2017-03-12 05:00:00','US/Eastern'); -- Shifts ahead 4 hours as expected

!image-2018-07-03-08-50-42-390.png!
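
For comparison, the offsets that the time zone database gives for US/Eastern around this boundary can be computed with {{java.time}}. This is only a reference sketch under that assumption, not the implementation of {{to_utc_timestamp}}; {{ToUtcSketch}} and {{toUtc}} are hypothetical names. With these rules, 21:00 on 3/11 is still EST and maps to 02:00 UTC on 3/12, i.e., a 5 hour shift.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Reference computation with java.time; not Hive's to_utc_timestamp code.
class ToUtcSketch {
    // Interpret a wall-clock value in zoneId and return the same instant in UTC.
    static LocalDateTime toUtc(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId))
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        // Before the transition: EST, UTC-5.
        System.out.println(toUtc(LocalDateTime.parse("2017-03-11T21:00:00"), "US/Eastern"));
        // After the transition: EDT, UTC-4.
        System.out.println(toUtc(LocalDateTime.parse("2017-03-12T03:00:00"), "US/Eastern"));
    }
}
```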





[jira] [Created] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities

2018-07-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20090:
--

 Summary: Extend creation of semijoin reduction filters to be able 
to discover new opportunities
 Key: HIVE-20090
 URL: https://issues.apache.org/jira/browse/HIVE-20090
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Assume the following plan:
{noformat}
TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9]
TS[2] - RS[3] - JOIN[4] 
TS[6] - RS[7] - JOIN[8]
{noformat}

Currently, {{TS[6]}} may only be reduced with the output of {{RS[5]}}, 
i.e., input to join between both subplans.
However, it may be useful to consider other possibilities too, e.g., reduced by 
the output of {{RS[1]}} or {{RS[3]}}. For instance, this is important when, 
given a large plan, an edge between {{RS[5]}} and {{TS[0]}} would create a 
cycle, while an edge between {{RS[1]}} and {{TS[6]}} would not.
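
The cycle argument can be checked on the example plan with a plain reachability test: adding an edge from A to B closes a cycle exactly when A is already reachable from B. The sketch below is not the optimizer's code; it models operators as strings, treats the semijoin edge as directed from the RS to the TS it reduces, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: operators are plain strings, edges point downstream.
class SemijoinCycleSketch {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Iterative depth-first search: is target reachable from start?
    boolean reaches(String start, String target) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String op = stack.pop();
            if (op.equals(target)) {
                return true;
            }
            if (seen.add(op)) {
                for (String next : edges.getOrDefault(op, Collections.emptyList())) {
                    stack.push(next);
                }
            }
        }
        return false;
    }

    // A new edge from -> to creates a cycle iff from is reachable from to.
    boolean wouldCreateCycle(String from, String to) {
        return reaches(to, from);
    }

    // The plan from the description above.
    static SemijoinCycleSketch examplePlan() {
        SemijoinCycleSketch plan = new SemijoinCycleSketch();
        plan.addEdge("TS[0]", "RS[1]");
        plan.addEdge("RS[1]", "JOIN[4]");
        plan.addEdge("TS[2]", "RS[3]");
        plan.addEdge("RS[3]", "JOIN[4]");
        plan.addEdge("JOIN[4]", "RS[5]");
        plan.addEdge("RS[5]", "JOIN[8]");
        plan.addEdge("TS[6]", "RS[7]");
        plan.addEdge("RS[7]", "JOIN[8]");
        plan.addEdge("JOIN[8]", "FS[9]");
        return plan;
    }

    public static void main(String[] args) {
        SemijoinCycleSketch plan = examplePlan();
        System.out.println(plan.wouldCreateCycle("RS[5]", "TS[0]"));
        System.out.println(plan.wouldCreateCycle("RS[1]", "TS[6]"));
    }
}
```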

This patch comprises two parts. First, it creates additional predicates when 
possible. Second, it removes duplicate semijoin reduction 
branches/predicates, e.g., if another semijoin that consumes the output of the 
same expression already reduces a certain table scan operator (a heuristic, since 
this may not result in the most efficient plan in all cases). Ultimately, the 
decision on whether to use one or another should be cost-driven (follow-up).





[jira] [Created] (HIVE-20102) Add a couple of additional tests for query parsing

2018-07-05 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20102:
--

 Summary: Add a couple of additional tests for query parsing
 Key: HIVE-20102
 URL: https://issues.apache.org/jira/browse/HIVE-20102
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional

2018-07-06 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20111:
--

 Summary: HBase-Hive (managed) table creation fails with strict 
managed table checks: Table is marked as a managed table but is not 
transactional
 Key: HIVE-20111
 URL: https://issues.apache.org/jira/browse/HIVE-20111
 Project: Hive
  Issue Type: Bug
  Components: Hive, StorageHandler
Affects Versions: 3.0.0
Reporter: Dileep Kumar Chiguruvada
Assignee: Nishant Bangarwa
 Fix For: 3.0.0


Druid-Hive (managed) table creation fails with strict managed table checks: 
Table is marked as a managed table but is not transactional

{code}
drop table if exists calcs;
create table calcs
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
"druid.segment.granularity" = "MONTH",
"druid.query.granularity" = "DAY")
AS SELECT
cast(datetime0 as timestamp with local time zone) `__time`,
key,
str0, str1, str2, str3,
date0, date1, date2, date3,
time0, time1,
datetime0, datetime1,
zzz,
cast(bool0 as string) bool0,
cast(bool1 as string) bool1,
cast(bool2 as string) bool2,
cast(bool3 as string) bool3,
int0, int1, int2, int3,
num0, num1, num2, num3, num4
from tableau_orc.calcs;

2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running (Executing on YARN cluster with App id application_1530592209763_0009)
...
...
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: 0
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: 330
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : org.apache.hadoop.hive.llap.counters.LlapWmCounters:
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_RUNNING_NS: 0
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_QUEUED_NS: 2162643606
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_RUNNING_NS: 12151664909
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-0:MOVE] in serial mode
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to directory hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs from hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-4:DDL] in serial mode
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.)
2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788); Time taken: 6.794 seconds
2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) (state=08S01,code=1)
{code}

This will not allow Druid tables to be managed.

So it's not straightforward to create Druid tables.

while trying to modify things to exter

[jira] [Created] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional

2018-07-06 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20112:
--

 Summary: Accumulo-Hive (managed) table creation fails with strict 
managed table checks: Table is marked as a managed table but is not 
transactional
 Key: HIVE-20112
 URL: https://issues.apache.org/jira/browse/HIVE-20112
 Project: Hive
  Issue Type: Bug
  Components: Hive, StorageHandler
Affects Versions: 3.0.0
Reporter: Dileep Kumar Chiguruvada
Assignee: Jesus Camacho Rodriguez
 Fix For: 3.0.0


Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict 
managed table checks: Table is marked as a managed table but is not 
transactional





[jira] [Created] (HIVE-20123) Fix masking tests after HIVE-19617

2018-07-09 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20123:
--

 Summary: Fix masking tests after HIVE-19617
 Key: HIVE-20123
 URL: https://issues.apache.org/jira/browse/HIVE-20123
 Project: Hive
  Issue Type: Test
Affects Versions: 3.0.0, 3.1.0, 4.0.0, 3.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Masking test results were changed inadvertently when HIVE-19617 went in, since 
table names were changed.





[jira] [Created] (HIVE-20148) Introduce rule to pushdown TopNKey through plan

2018-07-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20148:
--

 Summary: Introduce rule to pushdown TopNKey through plan
 Key: HIVE-20148
 URL: https://issues.apache.org/jira/browse/HIVE-20148
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Affects Versions: 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Teddy Choi


Follow-up for HIVE-17896, which introduces TopNKey operator.







[jira] [Created] (HIVE-20212) Hiveserver2 in http mode not emitting metric default.General.open_connections

2018-07-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20212:
--

 Summary: Hiveserver2 in http mode not emitting metric 
default.General.open_connections
 Key: HIVE-20212
 URL: https://issues.apache.org/jira/browse/HIVE-20212
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Dinesh Chitlangia
Assignee: Jesus Camacho Rodriguez
 Fix For: 3.0.0


Instances in binary mode are emitting the metric 
_default.General.open_connections_ but the instances operating in http mode are 
not emitting this metric.





[jira] [Created] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20213:
--

 Summary: Upgrade Calcite to 1.17.0
 Key: HIVE-20213
 URL: https://issues.apache.org/jira/browse/HIVE-20213
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20241) Support partitioning spec in CTAS statements

2018-07-25 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20241:
--

 Summary: Support partitioning spec in CTAS statements
 Key: HIVE-20241
 URL: https://issues.apache.org/jira/browse/HIVE-20241
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, for partitioned tables we have to declare the table and insert the 
data in separate operations. This issue is to extend the CTAS statement to 
support specifying partition columns.

For instance:
{code:sql}
CREATE TABLE partition_ctas_1 PARTITIONED BY (key) AS
SELECT value, key FROM src where key > 200 and key < 300;
{code}






[jira] [Created] (HIVE-20251) Improve message in SharedWorkOptimizer when cycles are found in the plan

2018-07-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20251:
--

 Summary: Improve message in SharedWorkOptimizer when cycles are 
found in the plan
 Key: HIVE-20251
 URL: https://issues.apache.org/jira/browse/HIVE-20251
 Project: Hive
  Issue Type: Improvement
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, if there is a cycle, e.g., due to semijoin, which should not happen, 
SharedWorkOptimizer will just loop infinitely. It would be better to throw an 
Exception in those cases.
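
One way to fail fast instead of looping is to track the operators on the current traversal path and throw when one is seen twice. The sketch below shows that idea on a plain adjacency map. It is not the SharedWorkOptimizer code: {{CycleCheckSketch}} and {{assertAcyclic}} are hypothetical names, and the exception type is a placeholder.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a traversal that throws on a cycle instead of looping.
class CycleCheckSketch {
    // children maps each operator name to the operators that consume its output.
    static void assertAcyclic(Map<String, List<String>> children) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String op : children.keySet()) {
            visit(op, children, done, onPath);
        }
    }

    private static void visit(String op, Map<String, List<String>> children,
                              Set<String> done, Set<String> onPath) {
        if (done.contains(op)) {
            return; // already fully explored, no cycle through here
        }
        if (!onPath.add(op)) {
            // op is already on the current path, so following it again would loop.
            throw new IllegalStateException("Cycle found in plan at operator " + op);
        }
        for (String child : children.getOrDefault(op, Collections.emptyList())) {
            visit(child, children, done, onPath);
        }
        onPath.remove(op);
        done.add(op);
    }
}
```

The message includes the operator at which the cycle closes, which is the kind of detail that would make such a failure easier to diagnose.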





[jira] [Created] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable

2018-07-28 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20263:
--

 Summary: Typo in HiveReduceExpressionsWithStatsRule variable
 Key: HIVE-20263
 URL: https://issues.apache.org/jira/browse/HIVE-20263
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-20263.patch







[jira] [Created] (HIVE-20281) SharedWorkOptimizer fails with 'operator cache contents and actual plan differ'

2018-07-31 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20281:
--

 Summary: SharedWorkOptimizer fails with 'operator cache contents 
and actual plan differ'
 Key: HIVE-20281
 URL: https://issues.apache.org/jira/browse/HIVE-20281
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 4.0.0, 3.2.0
Reporter: Ashutosh Chauhan
Assignee: Jesus Camacho Rodriguez


HIVE-18201 seems to trigger a latent bug in SW optimizer. Test 
{{subquery_in_having}} fails with:
{code}
2018-07-31T08:42:57,328 DEBUG [b68f20cc-54d5-466d-b512-1540b3a43396 main] 
optimizer.SharedWorkOptimizer: After SharedWorkExtendedOptimizer:
TS[0]-SEL[1]-MAPJOIN[131]-FIL[12]-SEL[13]-GBY[14]-RS[15]-GBY[16]-SEL[17]-MAPJOIN[136]-MAPJOIN[137]-FIL[103]-SEL[104]-FS[105]
 
-FIL[113]-SEL[20]-RS[44]-MAPJOIN[133]-SEL[47]-GBY[48]-RS[49]-GBY[50]-SEL[51]-GBY[55]-RS[98]-MAPJOIN[136]
  
-RS[88]-GBY[89]-SEL[120]-FIL[116]-SEL[91]-GBY[93]-RS[94]-GBY[95]-SEL[96]-RS[101]-MAPJOIN[137]
TS[2]-FIL[112]-GBY[5]-RS[6]-GBY[7]-SEL[8]-RS[10]-MAPJOIN[131]
 
-RS[31]-MAPJOIN[132]-FIL[33]-SEL[34]-GBY[35]-RS[36]-GBY[37]-SEL[38]-GBY[42]-MAPJOIN[133]
TS[21]-FIL[114]-SEL[22]-MAPJOIN[132]
2018-07-31T08:42:57,329 ERROR [b68f20cc-54d5-466d-b512-1540b3a43396 main] ql.Driver: FAILED: SemanticException Error in shared work optimizer: operator cache contentsand actual plan differ
org.apache.hadoop.hive.ql.parse.SemanticException: Error in shared work optimizer: operator cache contentsand actual plan differ
    at org.apache.hadoop.hive.ql.optimizer.SharedWorkOptimizer.transform(SharedWorkOptimizer.java:524)
    at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:185)
    at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:146)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12361)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:165)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:663)
{code}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20301) Enable vectorization for materialized view rewriting tests

2018-08-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20301:
--

 Summary: Enable vectorization for materialized view rewriting tests
 Key: HIVE-20301
 URL: https://issues.apache.org/jira/browse/HIVE-20301
 Project: Hive
  Issue Type: Test
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20302) LLAP: non-vectorized execution in IO ignores virtual columns, including ROW__ID

2018-08-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20302:
--

 Summary: LLAP: non-vectorized execution in IO ignores virtual 
columns, including ROW__ID
 Key: HIVE-20302
 URL: https://issues.apache.org/jira/browse/HIVE-20302
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20314) Include partition pruning in materialized view rewriting

2018-08-03 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20314:
--

 Summary: Include partition pruning in materialized view rewriting
 Key: HIVE-20314
 URL: https://issues.apache.org/jira/browse/HIVE-20314
 Project: Hive
  Issue Type: Improvement
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Materialized view rewriting should take partition pruning into account, so 
that the cost of an expression using the materialized view is reduced when 
some of its partitions are pruned.





[jira] [Created] (HIVE-20332) Materialized views: Introduce heuristic on selectivity over ROW__ID to favour incremental rebuild

2018-08-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20332:
--

 Summary: Materialized views: Introduce heuristic on selectivity 
over ROW__ID to favour incremental rebuild
 Key: HIVE-20332
 URL: https://issues.apache.org/jira/browse/HIVE-20332
 Project: Hive
  Issue Type: Improvement
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, we do not expose stats over {{ROW__ID.writeId}} to the optimizer. 
Even if we did, we always assume a uniform distribution of the column values, 
which can easily lead to overestimating the number of rows read when we 
filter on {{ROW__ID.writeId}} for materialized views (think of a large 
transaction for MV creation followed by small ones for incremental 
maintenance). This overestimation can prevent incremental view maintenance 
from being triggered, because the cost of the incremental plan is 
overestimated (we think we will read more rows than we actually do). This 
could be fixed by introducing histograms that better reflect the distribution 
of the column values.

Until then, we will use a config variable that sets the selectivity for 
filter conditions on {{ROW__ID}} during cost calculation. Setting that 
variable to a low value will favour incremental rebuild over full rebuild.
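
The overestimation described above can be illustrated with a small sketch. It is an illustration under stated assumptions, not Hive's cost model: the function names, the row counts, and the `override_selectivity` parameter (standing in for the proposed config variable, whose name is not given here) are all hypothetical.

```python
# Hypothetical illustration; none of these names come from Hive.

def uniform_selectivity(low, high, cutoff):
    """Selectivity of `writeId > cutoff`, assuming the values are spread
    uniformly over the closed integer range [low, high]."""
    if cutoff >= high:
        return 0.0
    if cutoff < low:
        return 1.0
    return (high - cutoff) / (high - low + 1)


def estimated_rows(total_rows, low, high, cutoff, override_selectivity=None):
    """Rows the optimizer expects to read for `writeId > cutoff`.

    `override_selectivity` plays the role of the proposed config variable:
    when set, it replaces the uniform-distribution estimate.
    """
    if override_selectivity is not None:
        selectivity = override_selectivity
    else:
        selectivity = uniform_selectivity(low, high, cutoff)
    return total_rows * selectivity


# One large transaction (writeId 1) loads 1,000,000 rows for MV creation.
# Nine small transactions (writeIds 2..10) then add 10 rows each.
rows_per_write_id = {1: 1_000_000}
rows_per_write_id.update({w: 10 for w in range(2, 11)})
total = sum(rows_per_write_id.values())

# An incremental rebuild only reads rows with writeId > 1.
actual = sum(n for w, n in rows_per_write_id.items() if w > 1)
uniform = estimated_rows(total, 1, 10, 1)
overridden = estimated_rows(total, 1, 10, 1, override_selectivity=0.001)

print(actual, round(uniform), round(overridden))
```

With these made-up numbers the uniform assumption estimates several orders of magnitude more rows than are actually read, while a low fixed selectivity brings the estimate much closer.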





[jira] [Created] (HIVE-20335) Add tests for materialized view rewriting with composite aggregation functions

2018-08-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20335:
--

 Summary: Add tests for materialized view rewriting with composite 
aggregation functions
 Key: HIVE-20335
 URL: https://issues.apache.org/jira/browse/HIVE-20335
 Project: Hive
  Issue Type: Test
  Components: Materialized views, Test
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20336) Masking and filtering policies for materialized views

2018-08-08 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20336:
--

 Summary: Masking and filtering policies for materialized views
 Key: HIVE-20336
 URL: https://issues.apache.org/jira/browse/HIVE-20336
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Implement masking and filtering policies for materialized views.





[jira] [Created] (HIVE-20347) hive.optimize.sort.dynamic.partition should work with partitioned CTAS and MV

2018-08-08 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20347:
--

 Summary: hive.optimize.sort.dynamic.partition should work with 
partitioned CTAS and MV
 Key: HIVE-20347
 URL: https://issues.apache.org/jira/browse/HIVE-20347
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Affects Versions: 4.0.0, 3.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20379) Rewriting with partitioned materialized views may reference wrong column

2018-08-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20379:
--

 Summary: Rewriting with partitioned materialized views may 
reference wrong column
 Key: HIVE-20379
 URL: https://issues.apache.org/jira/browse/HIVE-20379
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild

2018-08-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20382:
--

 Summary: Materialized views: Introduce heuristic to favour 
incremental rebuild
 Key: HIVE-20382
 URL: https://issues.apache.org/jira/browse/HIVE-20382
 Project: Hive
  Issue Type: Improvement
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20391) HiveAggregateReduceFunctionsRule may infer wrong return type when decomposing aggregate function

2018-08-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20391:
--

 Summary: HiveAggregateReduceFunctionsRule may infer wrong return 
type when decomposing aggregate function
 Key: HIVE-20391
 URL: https://issues.apache.org/jira/browse/HIVE-20391
 Project: Hive
  Issue Type: Bug
  Components: CBO, Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-20391.patch







[jira] [Created] (HIVE-20520) length(CHAR) doesn't consider trailing space

2018-09-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20520:
--

 Summary: length(CHAR) doesn't consider trailing space
 Key: HIVE-20520
 URL: https://issues.apache.org/jira/browse/HIVE-20520
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Reproduce steps:

{code:java}
create table test(a char(2), b varchar(2));
insert into test values('L ', 'L ');
select length(a),length(b) from test;
+--+--+
| _c0  | _c1  |
+--+--+
| 1| 2|
+--+--+
1 row selected (0.185 seconds)
{code}

Here, trailing spaces in the CHAR value are trimmed before the length is 
computed, whereas leading spaces are not trimmed.
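
The reported behaviour can be mimicked in a few lines. This is only a sketch of the observed results, not Hive's UDF code, and both helper names are hypothetical.

```python
def char_length_observed(value: str) -> int:
    """Length as reported above for CHAR: trailing spaces are not counted."""
    return len(value.rstrip(" "))


def varchar_length(value: str) -> int:
    """Length for VARCHAR: every character counts, including trailing spaces."""
    return len(value)


print(char_length_observed("L "), varchar_length("L "))  # 1 2
print(char_length_observed(" L"), varchar_length(" L"))  # 2 2
```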





[jira] [Created] (HIVE-20522) HiveFilterSetOpTransposeRule may throw assertion error due to nullability of fields

2018-09-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20522:
--

 Summary: HiveFilterSetOpTransposeRule may throw assertion error 
due to nullability of fields
 Key: HIVE-20522
 URL: https://issues.apache.org/jira/browse/HIVE-20522
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


{noformat}
[ERROR] Failures:
[ERROR]   TestMiniLlapLocalCliDriver.testCliDriver:59 Cannot add expression of 
different type to set:
set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" column1, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" column2, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" NOT NULL column3) 
NOT NULL
expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" NOT NULL column1, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" NOT NULL column2, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
NOT NULL column3) NOT NULL
set is rel#260:HiveFilter.HIVE.[](input=HepRelVertex#251,condition=<($2, 
_UTF-16LE'100'))
expression is HiveFilter#262
{noformat}

The q file contains examples that may reproduce the failure.





[jira] [Created] (HIVE-20537) Multi-column joins estimates with uncorrelated columns different in CBO and Hive

2018-09-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20537:
--

 Summary: Multi-column joins estimates with uncorrelated columns 
different in CBO and Hive
 Key: HIVE-20537
 URL: https://issues.apache.org/jira/browse/HIVE-20537
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20563) Exception in vectorization execution of CASE statement

2018-09-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20563:
--

 Summary: Exception in vectorization execution of CASE statement
 Key: HIVE-20563
 URL: https://issues.apache.org/jira/browse/HIVE-20563
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Jesus Camacho Rodriguez


With the following stacktrace:
{code}
java.lang.Exception: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
~[hadoop-mapreduce-client-common-3.1.0.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) 
[hadoop-mapreduce-client-common-3.1.0.jar:?]
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
 ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:973)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
~[hadoop-mapreduce-client-core-3.1.0.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
 ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
cstring1
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:149)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:136)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:812)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMa

[jira] [Created] (HIVE-20569) hour/minute/second UDFs not working on String values with Time format

2018-09-16 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20569:
--

 Summary: hour/minute/second UDFs not working on String values with 
Time format
 Key: HIVE-20569
 URL: https://issues.apache.org/jira/browse/HIVE-20569
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 4.0.0, 3.2.0
Reporter: Jesus Camacho Rodriguez


We do not have a 'TIME' type in Hive, but applying these functions to String 
values in time format used to work before HIVE-12192 was applied, e.g., 
HOUR('15:13:46').

This issue is to extend hour/minute/second so that they can parse time string 
values.
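
A minimal sketch of the requested behaviour: try a full timestamp format first, then fall back to a time-only format. The helper names and the two format strings are assumptions made for the example, not the UDF implementation.

```python
from datetime import datetime

# Formats tried in order; the second is the time-only case this issue asks for.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%H:%M:%S")


def parse_time_parts(text):
    """Return (hour, minute, second) for a timestamp or time-only string,
    or None when the text matches neither format."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
        return parsed.hour, parsed.minute, parsed.second
    return None


def hour(text):
    """Hour component, or None (SQL NULL) when the string cannot be parsed."""
    parts = parse_time_parts(text)
    return None if parts is None else parts[0]


print(hour("15:13:46"))             # 15
print(hour("2018-09-16 15:13:46"))  # 15
print(hour("not a time"))           # None
```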





[jira] [Created] (HIVE-20612) Create new join key correlation flag for CBO

2018-09-20 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20612:
--

 Summary: Create new join key correlation flag for CBO
 Key: HIVE-20612
 URL: https://issues.apache.org/jira/browse/HIVE-20612
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20625) Regex patterns not working in SHOW MATERIALIZED VIEWS ''

2018-09-23 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20625:
--

 Summary: Regex patterns not working in SHOW MATERIALIZED VIEWS 
''
 Key: HIVE-20625
 URL: https://issues.apache.org/jira/browse/HIVE-20625
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: kristine hahn
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-25 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20636:
--

 Summary: Improve number of null values estimation after outer join
 Key: HIVE-20636
 URL: https://issues.apache.org/jira/browse/HIVE-20636
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20669) JdbcStorageHandler push union of two different datasource to jdbc driver

2018-10-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20669:
--

 Summary: JdbcStorageHandler push union of two different datasource 
to jdbc driver
 Key: HIVE-20669
 URL: https://issues.apache.org/jira/browse/HIVE-20669
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Daniel Dai
Assignee: Jesus Camacho Rodriguez
 Attachments: external_jdbc_table2.q

Test case attached. The following query fails:
{code}
SELECT * FROM ext_auth1 JOIN ext_auth2 ON ext_auth1.ikey = ext_auth2.ikey
{code}
Error message:
{code}
2018-09-28T00:36:23,860 DEBUG [17b954d9-3250-45a9-995e-1b3f8277a681 main] 
dao.GenericJdbcDatabaseAccessor: Query to execute is [SELECT *
FROM (SELECT *
FROM "SIMPLE_DERBY_TABLE1"
WHERE "ikey" IS NOT NULL) AS "t"
INNER JOIN (SELECT *
FROM "SIMPLE_DERBY_TABLE2"
WHERE "ikey" IS NOT NULL) AS "t0" ON "t"."ikey" = "t0"."ikey" {LIMIT 1}]
2018-09-28T00:36:23,864 ERROR [17b954d9-3250-45a9-995e-1b3f8277a681 main] 
dao.GenericJdbcDatabaseAccessor: Error while trying to get column names.
java.sql.SQLSyntaxErrorException: Table/View 'SIMPLE_DERBY_TABLE2' does not 
exist.
at 
org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) 
~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedPreparedStatement.(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedPreparedStatement42.(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.jdbc.Driver42.newEmbedPreparedStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:281)
 ~[commons-dbcp-1.4.jar:1.4]
at 
org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
 ~[commons-dbcp-1.4.jar:1.4]
at 
org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:74)
 [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:78) 
[hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) 
[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:540) 
[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:90)
 [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77)
 [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:295)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:277) 
[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:11100)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11468)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11427)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:525)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache

[jira] [Created] (HIVE-20677) JDBC storage handler ordering problem - single split flag

2018-10-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20677:
--

 Summary: JDBC storage handler ordering problem - single split flag
 Key: HIVE-20677
 URL: https://issues.apache.org/jira/browse/HIVE-20677
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Gunther Hagleitner
Assignee: Jesus Camacho Rodriguez


When Calcite pushes queries into the JDBC handler, splitting the query via 
offset/limit can cause issues (the RDBMS is not guaranteed to return the data 
in the same order every time).

For these cases we want to:

a) Add a "do not split" flag to the JDBC handler. In that mode the JDBC 
handler will skip the count and offset/limit processing and just run the 
query in a single node. The flag will default to false.

b) Have Calcite automatically set this flag.
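
The ordering problem can be reproduced with a toy model. In this sketch, `run_query` is a hypothetical stand-in for an RDBMS that returns rows in a different order on each call (simulated by rotating the list so the effect is repeatable); none of the names come from the JDBC handler.

```python
def run_query(rows, call_number, offset=0, limit=None):
    """Stand-in for a query without ORDER BY: each call may see a different
    row order. The order is rotated by `call_number` to make this
    reproducible. `rows` must be non-empty."""
    shift = call_number % len(rows)
    ordered = rows[shift:] + rows[:shift]
    end = None if limit is None else offset + limit
    return ordered[offset:end]


def read_with_splits(rows, num_splits):
    """Split mode: each split issues its own offset/limit query."""
    size = -(-len(rows) // num_splits)  # ceiling division
    result = []
    for split in range(num_splits):
        result.extend(run_query(rows, split, offset=split * size, limit=size))
    return result


def read_single_split(rows):
    """The proposed "do not split" mode: one query, no offset/limit."""
    return run_query(rows, 0)


table = ["a", "b", "c", "d"]
print(read_with_splits(table, 2))  # ['a', 'b', 'd', 'a']
print(read_single_split(table))    # ['a', 'b', 'c', 'd']
```

In split mode row `c` is lost and row `a` is read twice, because the second split sees a different ordering than the first. The single-split read returns every row exactly once.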





[jira] [Created] (HIVE-20691) Fix org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl]

2018-10-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20691:
--

 Summary: Fix 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl]
 Key: HIVE-20691
 URL: https://issues.apache.org/jira/browse/HIVE-20691
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Blocking all ptest runs.





[jira] [Created] (HIVE-20696) msck_*.q tests are broken

2018-10-05 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20696:
--

 Summary: msck_*.q tests are broken
 Key: HIVE-20696
 URL: https://issues.apache.org/jira/browse/HIVE-20696
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Broken by HIVE-19617, which replaced table names but did not replace folder 
paths in the q files.





[jira] [Created] (HIVE-20702) Account for overhead from datastructure aware estimations during mapjoin selection

2018-10-05 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20702:
--

 Summary: Account for overhead from datastructure aware estimations 
during mapjoin selection
 Key: HIVE-20702
 URL: https://issues.apache.org/jira/browse/HIVE-20702
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20704) Extend HivePreFilteringRule to support other functions

2018-10-05 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20704:
--

 Summary: Extend HivePreFilteringRule to support other functions
 Key: HIVE-20704
 URL: https://issues.apache.org/jira/browse/HIVE-20704
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20706) external_jdbc_table2.q failing intermittently

2018-10-06 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20706:
--

 Summary: external_jdbc_table2.q failing intermittently
 Key: HIVE-20706
 URL: https://issues.apache.org/jira/browse/HIVE-20706
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Collision with external_jdbc_table.q tables.





[jira] [Created] (HIVE-20716) Set default value for hive.cbo.stats.correlated.multi.key.joins to true

2018-10-09 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20716:
--

 Summary: Set default value for 
hive.cbo.stats.correlated.multi.key.joins to true
 Key: HIVE-20716
 URL: https://issues.apache.org/jira/browse/HIVE-20716
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20718) Add perf cli driver with constraints

2018-10-09 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20718:
--

 Summary: Add perf cli driver with constraints
 Key: HIVE-20718
 URL: https://issues.apache.org/jira/browse/HIVE-20718
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Now that subtasks in HIVE-17039 will be completed, it will be good to have a 
perf cli driver with constraints declaration.





[jira] [Created] (HIVE-20727) Disable flaky test: stat_estimate_related_col.q

2018-10-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20727:
--

 Summary: Disable flaky test: stat_estimate_related_col.q
 Key: HIVE-20727
 URL: https://issues.apache.org/jira/browse/HIVE-20727
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20728) Enable flaky test back: stat_estimate_related_col.q

2018-10-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20728:
--

 Summary: Enable flaky test back: stat_estimate_related_col.q
 Key: HIVE-20728
 URL: https://issues.apache.org/jira/browse/HIVE-20728
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20729) TestJdbcWithMiniLlapArrow.testKillQuery fail frequently

2018-10-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20729:
--

 Summary: TestJdbcWithMiniLlapArrow.testKillQuery fail frequently
 Key: HIVE-20729
 URL: https://issues.apache.org/jira/browse/HIVE-20729
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20743) Enable udaf_context_ngrams.q and udaf_corr.q tests

2018-10-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20743:
--

 Summary: Enable udaf_context_ngrams.q and udaf_corr.q tests
 Key: HIVE-20743
 URL: https://issues.apache.org/jira/browse/HIVE-20743
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Jesus Camacho Rodriguez


Two qfile tests for TestCliDriver show differences; both may be related to 
numeric precision issues:
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_context_ngrams] 
(batchId=79)

Error:
Client Execution succeeded but contained differences (error code = 1) after 
executing udaf_context_ngrams.q 
43c43
< [{"ngram":["travelling"],"estfrequency":1.0}]
---
> [{"ngram":["travelling"],"estfrequency":3.0}]

org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_corr] (batchId=84)

Client Execution succeeded but contained differences (error code = 1) after 
executing udaf_corr.q 
100c100
< 0.6633880657639324
---
> 0.6633880657639326
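
The udaf_corr difference is in the last digit only, which is the kind of discrepancy that a change in floating-point evaluation order can produce. The sketch below shows the general effect and a tolerance-based comparison. It does not claim to identify the actual cause in Hive, and the tolerance value is an arbitrary choice for the example.

```python
import math

values = [0.1, 0.2, 0.3]

# Floating-point addition is not associative: grouping changes the result.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)                              # False
print(math.isclose(left_to_right, right_to_left, rel_tol=1e-12))   # True

# The two udaf_corr outputs from the diff above.
expected, actual = 0.6633880657639324, 0.6633880657639326
print(expected == actual)                                # False
print(math.isclose(expected, actual, rel_tol=1e-12))     # True
```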







[jira] [Created] (HIVE-20744) Use SQL constraints to improve join reordering algorithm

2018-10-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20744:
--

 Summary: Use SQL constraints to improve join reordering algorithm
 Key: HIVE-20744
 URL: https://issues.apache.org/jira/browse/HIVE-20744
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Until now, join reordering costing was based entirely on stats stored for the 
base tables and their columns. Now the optimizer can rely on constraints. 
Hence, this patch makes the join reordering costing use constraints and, if 
it does not find any, fall back to the old code path.
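
As a rough sketch of the idea, assuming a primary key/foreign key constraint is the kind of constraint being used: with a declared PK-FK relationship the join output cannot exceed the FK side, so the estimate does not depend on column stats. The formulas and function names below are textbook-style assumptions for illustration, not the patch's costing code.

```python
def join_rows_from_stats(left_rows, right_rows, left_ndv, right_ndv):
    """Stats-based estimate for an equi-join:
    |L| * |R| / max(ndv(L.key), ndv(R.key))."""
    return left_rows * right_rows / max(left_ndv, right_ndv, 1)


def join_rows(fk_rows, pk_rows, fk_ndv, pk_ndv, pk_fk_constraint,
              pk_selectivity=1.0):
    """Prefer a declared constraint; otherwise fall back to the stats path.

    With a PK-FK constraint every FK row matches at most one PK row, so the
    result is bounded by the FK side; filters on the PK side scale it down.
    """
    if pk_fk_constraint:
        return fk_rows * pk_selectivity
    return join_rows_from_stats(fk_rows, pk_rows, fk_ndv, pk_ndv)


# Inaccurate NDV stats (10 and 20) inflate the stats-based estimate.
print(join_rows(1000, 100, 10, 20, pk_fk_constraint=False))  # 5000.0
print(join_rows(1000, 100, 10, 20, pk_fk_constraint=True))   # 1000.0
```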





[jira] [Created] (HIVE-20748) Do not allow to enable materialized view rewriting when plan pattern is not allowed

2018-10-15 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20748:
--

 Summary: Do not allow to enable materialized view rewriting when 
plan pattern is not allowed
 Key: HIVE-20748
 URL: https://issues.apache.org/jira/browse/HIVE-20748
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For instance, the rewriting algorithm currently does not support some 
operators, and we cannot have non-deterministic functions in the MV 
definition. In those cases, we should fail either when we try to create the 
MV with rewriting enabled, or when we enable rewriting for an MV that has 
already been created.
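
One way to picture the proposal is a single validation routine shared by both entry points. Everything in this sketch is hypothetical: the function names, the dictionary representation of a view, and the placeholder rule sets (only `rand` is a real example of a non-deterministic function).

```python
class RewritingNotAllowed(Exception):
    """Raised when an MV definition cannot be used for query rewriting."""


def check_rewriting_allowed(operators, functions, unsupported_operators,
                            non_deterministic_functions):
    """Raise if the MV plan uses an unsupported operator or a
    non-deterministic function; otherwise return None."""
    bad_operators = sorted(set(operators) & set(unsupported_operators))
    bad_functions = sorted(set(functions) & set(non_deterministic_functions))
    if bad_operators or bad_functions:
        raise RewritingNotAllowed(
            f"unsupported operators: {bad_operators}, "
            f"non-deterministic functions: {bad_functions}"
        )


# Placeholder rule sets for the example only.
UNSUPPORTED = {"example_unsupported_op"}
NON_DETERMINISTIC = {"rand"}


def create_materialized_view(operators, functions, rewriting_enabled):
    """Creation path: validate only when rewriting is requested."""
    if rewriting_enabled:
        check_rewriting_allowed(operators, functions,
                                UNSUPPORTED, NON_DETERMINISTIC)
    return {"operators": operators, "functions": functions,
            "rewriting_enabled": rewriting_enabled}


def enable_rewriting(view):
    """Alter path: the same validation runs before the flag is flipped."""
    check_rewriting_allowed(view["operators"], view["functions"],
                            UNSUPPORTED, NON_DETERMINISTIC)
    view["rewriting_enabled"] = True
    return view
```

A view that uses `rand` can still be created with rewriting disabled, but a later call to `enable_rewriting` raises `RewritingNotAllowed` and leaves the flag unchanged.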





[jira] [Created] (HIVE-20755) Add the ability to push Dynamic Between and Bloom filters to JDBC handler

2018-10-16 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20755:
--

 Summary: Add the ability to push Dynamic Between and Bloom filters 
to JDBC handler
 Key: HIVE-20755
 URL: https://issues.apache.org/jira/browse/HIVE-20755
 Project: Hive
  Issue Type: New Feature
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez


HIVE-20683 has done some work to push semijoin reduction to Druid. We could 
use a similar model to push it to JDBC sources, which would be quite useful, 
e.g., for joins between Hive and JDBC sources.
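
For the dynamic BETWEEN part, the idea could look roughly like the sketch below: the minimum and maximum join key seen on the Hive side become a range predicate on the query sent to the JDBC source. The function name, the wrapping-subquery approach, and the table name in the usage line are assumptions for illustration. The column name is assumed to be a trusted identifier, and Bloom filters are not covered.

```python
def add_dynamic_between(base_query, column, build_side_keys):
    """Wrap `base_query` with a BETWEEN filter derived from the join keys
    seen on the other side of the join. Returns (sql, params)."""
    keys = [k for k in build_side_keys if k is not None]
    if not keys:
        # No non-null keys: an inner join on this column cannot match.
        return f"SELECT * FROM ({base_query}) t WHERE 1 = 0", ()
    sql = f'SELECT * FROM ({base_query}) t WHERE t."{column}" BETWEEN ? AND ?'
    return sql, (min(keys), max(keys))


sql, params = add_dynamic_between('SELECT * FROM "remote_table"', "ikey",
                                  [7, None, 3, 12])
print(sql)
print(params)  # (3, 12)
```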





[jira] [Created] (HIVE-20767) Multiple project between join operators may affect join reordering using constraints

2018-10-17 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20767:
--

 Summary: Multiple project between join operators may affect join 
reordering using constraints
 Key: HIVE-20767
 URL: https://issues.apache.org/jira/browse/HIVE-20767
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-10-18 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20775:
--

 Summary: Factor cost of each SJ reduction when costing a follow-up 
reduction
 Key: HIVE-20775
 URL: https://issues.apache.org/jira/browse/HIVE-20775
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, while costing the SJ in a plan, the stats of a TS that is 
reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
tree. Ideally, we could adjust the stats to take into account decisions that 
have already been made.





[jira] [Created] (HIVE-20778) Join reordering may not be triggered if all joins in plan are created by decorrelation logic

2018-10-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20778:
--

 Summary: Join reordering may not be triggered if all joins in plan 
are created by decorrelation logic
 Key: HIVE-20778
 URL: https://issues.apache.org/jira/browse/HIVE-20778
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20783) Factor reduction in cost by DPP when costing SJ reduction

2018-10-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20783:
--

 Summary: Factor reduction in cost by DPP when costing SJ reduction
 Key: HIVE-20783
 URL: https://issues.apache.org/jira/browse/HIVE-20783
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, while costing the SJ in a plan, the stats of a TS that is 
reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
tree. Ideally, we could adjust the stats to take into account decisions that 
have already been made.





[jira] [Created] (HIVE-20788) Extended SJ reduction may backtrack columns incorrectly when creating filters

2018-10-22 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20788:
--

 Summary: Extended SJ reduction may backtrack columns incorrectly 
when creating filters
 Key: HIVE-20788
 URL: https://issues.apache.org/jira/browse/HIVE-20788
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Affects TPCDS query 24 with constraints.





[jira] [Created] (HIVE-20820) MV partition on clause position

2018-10-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20820:
--

 Summary: MV partition on clause position
 Key: HIVE-20820
 URL: https://issues.apache.org/jira/browse/HIVE-20820
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


It should obey the following syntax as per 
https://cwiki.apache.org/confluence/display/Hive/Materialized+views:
{code}
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
  [DISABLE REWRITE]
  [COMMENT materialized_view_comment]
  [PARTITIONED ON (col_name, ...)]
  [
    [ROW FORMAT row_format]
    [STORED AS file_format]
      | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
AS
<query>;
{code}
Currently it is positioned just before TBLPROPERTIES.





[jira] [Created] (HIVE-20821) Rewrite SUM0 into SUM + COALESCE combination

2018-10-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20821:
--

 Summary: Rewrite SUM0 into SUM + COALESCE combination
 Key: HIVE-20821
 URL: https://issues.apache.org/jira/browse/HIVE-20821
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


SUM0 is not vectorized, but SUM and COALESCE are, so the rewriting allows the 
expression to run vectorized.
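
A minimal sketch of why the rewriting should be safe, assuming SUM0 has the 
usual semantics of "SUM that returns 0 instead of NULL on empty or all-NULL 
input". The helper names below (sql_sum, sql_sum0, coalesce, rewritten_sum0) 
are hypothetical and only model the SQL semantics in Python; they are not Hive 
or Calcite APIs.

```python
from typing import Iterable, Optional


def sql_sum(values: Iterable[Optional[int]]) -> Optional[int]:
    """Model of SQL SUM: skips NULLs, returns NULL if nothing is left."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sql_sum0(values: Iterable[Optional[int]]) -> int:
    """Model of SUM0 (assumed semantics): like SUM, but 0 instead of NULL."""
    return sum(v for v in values if v is not None)


def coalesce(*args):
    """Model of SQL COALESCE: first non-NULL argument, else NULL."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def rewritten_sum0(values: Iterable[Optional[int]]) -> int:
    """The proposed rewriting: SUM0(x) => COALESCE(SUM(x), 0)."""
    return coalesce(sql_sum(values), 0)
```

Under these assumptions both forms agree on every input, including the empty 
and all-NULL cases, which are the only ones where SUM and SUM0 differ.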





[jira] [Created] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-10-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20822:
--

 Summary: Improvements to push computation to JDBC from Calcite
 Key: HIVE-20822
 URL: https://issues.apache.org/jira/browse/HIVE-20822
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20835) Interaction between constraints and MV rewriting may create loop in Calcite planner

2018-10-29 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20835:
--

 Summary: Interaction between constraints and MV rewriting may 
create loop in Calcite planner
 Key: HIVE-20835
 URL: https://issues.apache.org/jira/browse/HIVE-20835
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-20918) Flag to enable/disable pushdown of computation from Calcite into JDBC connection

2018-11-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20918:
--

 Summary: Flag to enable/disable pushdown of computation from 
Calcite into JDBC connection
 Key: HIVE-20918
 URL: https://issues.apache.org/jira/browse/HIVE-20918
 Project: Hive
  Issue Type: Improvement
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, there is no way to disable it. We will add a flag for that. By 
default, pushdown of computation will be enabled.





[jira] [Created] (HIVE-20920) Use SQL constraints to improve join reordering algorithm (II)

2018-11-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20920:
--

 Summary: Use SQL constraints to improve join reordering algorithm 
(II)
 Key: HIVE-20920
 URL: https://issues.apache.org/jira/browse/HIVE-20920
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Follow-up of HIVE-20744.





[jira] [Created] (HIVE-20949) Improve PKFK cardinality estimation in physical planning

2018-11-20 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20949:
--

 Summary: Improve PKFK cardinality estimation in physical planning
 Key: HIVE-20949
 URL: https://issues.apache.org/jira/browse/HIVE-20949
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Missing case for cartesian product and full outer joins.





[jira] [Created] (HIVE-20950) Transform OUTER join with condition always true into INNER join

2018-11-20 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20950:
--

 Summary: Transform OUTER join with condition always true into 
INNER join
 Key: HIVE-20950
 URL: https://issues.apache.org/jira/browse/HIVE-20950
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez


For instance, it may help the join reordering algorithm.





[jira] [Created] (HIVE-20957) Scale down stats for plan when DPP/SJ is applied

2018-11-21 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20957:
--

 Summary: Scale down stats for plan when DPP/SJ is applied
 Key: HIVE-20957
 URL: https://issues.apache.org/jira/browse/HIVE-20957
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer, Statistics
Reporter: Jesus Camacho Rodriguez


Currently, this scale-down only happens for the TS itself when SJ is applied 
(HIVE-20775). It would be better to implement an approach that applies the 
reduction to the TS operator for DPP/SJ and propagates the complete stats over 
the plan accordingly.





[jira] [Created] (HIVE-20976) JDB Queries containing Joins gives wrong results.

2018-11-27 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20976:
--

 Summary: JDB Queries containing Joins gives wrong results. 
 Key: HIVE-20976
 URL: https://issues.apache.org/jira/browse/HIVE-20976
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa
 Fix For: 3.1.0


Druid queries that join a table against itself give wrong results, 
e.g.:
{code} 
SELECT
  username AS `username`,
  SUM(double1) AS `sum_double1`
FROM
  druid_table_with_nulls `tbl1`
  JOIN (
    SELECT
      username AS `username`,
      SUM(double1) AS `sum_double2`
    FROM druid_table_with_nulls
    GROUP BY `username`
    ORDER BY `sum_double2` DESC
    LIMIT 10
  ) `tbl2`
  ON (`tbl1`.`username` = `tbl2`.`username`)
GROUP BY `tbl1`.`username`;
{code} 

In this case, one of the queries is a Druid scan query and the other is a 
groupBy query. 
During planning, the properties of these queries are set on the tableDesc and 
serdeInfo. While setting up the map work, we overwrite those properties with 
the properties present in serdeInfo. This causes the scan query results to be 
deserialized using the wrong column names, which results in NULL values. 





[jira] [Created] (HIVE-21006) Extend SharedWorkOptimizer to remove semijoins when there is a reutilization opportunity

2018-12-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21006:
--

 Summary: Extend SharedWorkOptimizer to remove semijoins when there 
is a reutilization opportunity
 Key: HIVE-21006
 URL: https://issues.apache.org/jira/browse/HIVE-21006
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Assume there are two TS operators in the plan over the same table, same 
columns, same conditions, etc.
The first TS operator, TS1, has an incoming SJ edge. The second TS operator, 
TS2, does not have any incoming SJ edge.
Since TS2 is reading the full table, we may just remove the SJ and TS1. Then we 
will keep and share TS2, since it reads all the data in any case.
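
The idea can be sketched as follows. This is only an illustration under 
simplifying assumptions: TableScan, has_sj_edge and share_scans are 
hypothetical names, two scans are considered equivalent when table, columns 
and predicate match exactly, and the real SharedWorkOptimizer logic is more 
involved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TableScan:
    """Hypothetical, simplified stand-in for a TS operator."""
    name: str
    table: str
    columns: tuple
    predicate: str
    has_sj_edge: bool  # True if a semijoin reduction feeds this scan


def share_scans(scans):
    """Map each kept scan name to the scan names that can reuse it.

    Scans are grouped by (table, columns, predicate). If a group contains a
    scan without an incoming SJ edge, that scan reads all the data anyway, so
    it is kept and the others (with their SJ edges) are dropped.
    """
    groups = defaultdict(list)
    for ts in scans:
        groups[(ts.table, ts.columns, ts.predicate)].append(ts)

    merged = {}
    for group in groups.values():
        full_readers = [ts for ts in group if not ts.has_sj_edge]
        if len(group) < 2 or not full_readers:
            continue  # nothing to share, or every scan is SJ-reduced
        keeper = full_readers[0]
        merged[keeper.name] = [ts.name for ts in group if ts is not keeper]
    return merged
```

For the TS1/TS2 case described above, share_scans would report that TS2 is 
kept and TS1 (together with its SJ) is removed.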





[jira] [Created] (HIVE-21039) CURRENT_TIMESTAMP returns value in UTC time zone

2018-12-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21039:
--

 Summary: CURRENT_TIMESTAMP returns value in UTC time zone
 Key: HIVE-21039
 URL: https://issues.apache.org/jira/browse/HIVE-21039
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Andrey Zinovyev
Assignee: Jesus Camacho Rodriguez


We're upgrading from Hive 1.2 to 3.1, and it seems that the new Hive returns 
CURRENT_TIMESTAMP in the UTC time zone, whereas before it was returned in the 
local (system default) time zone.

According to HIVE-5472, current_timestamp should use the user's local time 
zone. This behaviour was changed in HIVE-12192 (if I got it right). 
GenericUDFCurrentTimestamp now explicitly uses UTC as the time zone to 
initialise org.apache.hadoop.hive.common.type.Timestamp.

For example
Old hive:
{code}
hive> select current_timestamp;
OK
2018-12-12 22:43:39.024
{code}

New hive:
{code}
> select current_timestamp;
+--------------------------+
|           _c0            |
+--------------------------+
| 2018-12-12 19:43:57.024  |
+--------------------------+
{code}
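
To illustrate the difference, here is a small sketch in Python (not Hive code) 
that renders one and the same instant in UTC and in a local zone. The 
UTC+03:00 offset is an assumption inferred from the three-hour gap between the 
two outputs above; render, instant and session_zone are illustrative names.

```python
from datetime import datetime, timedelta, timezone


def render(instant: datetime, tz: timezone) -> str:
    """Format an instant as a wall-clock timestamp with millisecond precision."""
    return instant.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


# One fixed instant, expressed in UTC.
instant = datetime(2018, 12, 12, 19, 43, 57, 24000, tzinfo=timezone.utc)

# Assumed local (system default) zone of the user: UTC+03:00.
session_zone = timezone(timedelta(hours=3))

utc_text = render(instant, timezone.utc)    # what the new behaviour shows
local_text = render(instant, session_zone)  # what the old behaviour showed
```

The instant is identical in both cases; only the zone used to render the 
wall-clock value changes, which is what the two outputs above suggest.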





[jira] [Created] (HIVE-21046) Push IN clause with struct values to JDBC sources

2018-12-14 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21046:
--

 Summary: Push IN clause with struct values to JDBC sources
 Key: HIVE-21046
 URL: https://issues.apache.org/jira/browse/HIVE-21046
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Since some RDBMSs, e.g., Derby, may not support the ROW operator, just rewrite 
the IN clause into OR/AND clauses.
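
A minimal sketch of the intended rewriting, written in Python for 
illustration only. The function rewrite_struct_in is hypothetical, it assumes 
non-NULL literal values, and it uses Python's repr() for literals, which is 
not a proper SQL quoting routine.

```python
def rewrite_struct_in(columns, rows):
    """Rewrite (c1, c2, ...) IN ((v11, v12, ...), ...) as an OR of ANDs.

    columns: list of column names, e.g. ["a", "b"]
    rows:    list of value tuples, one per struct literal in the IN list
    """
    disjuncts = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row arity does not match the column list")
        # One equality per struct field, combined with AND.
        conjuncts = [f"{col} = {val!r}" for col, val in zip(columns, row)]
        disjuncts.append("(" + " AND ".join(conjuncts) + ")")
    # One conjunction per struct literal, combined with OR.
    return " OR ".join(disjuncts)
```

For example, rewrite_struct_in(["a", "b"], [(1, 2), (3, 4)]) produces a 
predicate that only uses =, AND and OR, so it does not depend on ROW support 
in the target database.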





[jira] [Created] (HIVE-21064) java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyStruct cannot be cast to org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch

2018-12-21 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21064:
--

 Summary: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazy.LazyStruct cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
 Key: HIVE-21064
 URL: https://issues.apache.org/jira/browse/HIVE-21064
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez


The error is present in one of the test files (parallel_orderby.q) and it was 
discovered while working on HIVE-16957. To reproduce:

{code}
set hive.mapred.mode=nonstrict;
set hive.stats.column.autogather=false;

create table src5_n2 (key string, value string);
load data local inpath '../../data/files/kv5.txt' into table src5_n2;
load data local inpath '../../data/files/kv5.txt' into table src5_n2;

set mapred.reduce.tasks = 4;
set hive.optimize.sampling.orderby=true;
set hive.optimize.sampling.orderby.percent=0.66f;

create table total_ordered as select * from src5_n2 order by key, value;
{code}

The create table statement throws the following exception:
{code}
java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyStruct 
cannot be cast to org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
{code}

The exception is printed in the q file. This was introduced when HIVE-18910 was 
checked in; I am not sure whether it is a known issue. Cc [~djaiswal] [~jdere]






[jira] [Created] (HIVE-21085) Materialized views registry starts non-external tez session

2019-01-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21085:
--

 Summary: Materialized views registry starts non-external tez 
session
 Key: HIVE-21085
 URL: https://issues.apache.org/jira/browse/HIVE-21085
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Prasanth Jayachandran
Assignee: Jesus Camacho Rodriguez


The materialized views registry calls SessionState.start(), which will start a 
regular Tez session. In the presence of external Tez sessions, it should get a 
session from the external sessions pool. There are also other places where 
SessionState.start() might be invoked.





[jira] [Created] (HIVE-21093) Push grouping sets in Aggregate operator to JDBC sources from Calcite

2019-01-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21093:
--

 Summary: Push grouping sets in Aggregate operator to JDBC sources 
from Calcite
 Key: HIVE-21093
 URL: https://issues.apache.org/jira/browse/HIVE-21093
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-21133) Add simulated materialized views useful for rewriting debugging

2019-01-17 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21133:
--

 Summary: Add simulated materialized views useful for rewriting 
debugging
 Key: HIVE-21133
 URL: https://issues.apache.org/jira/browse/HIVE-21133
 Project: Hive
  Issue Type: Improvement
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Implement simulated materialized views, useful to check whether a certain 
rewriting will be triggered. Simulated materialized view definitions will be 
stored in the user session, and they will only be used when simulation mode is 
enabled and the user runs {{explain cbo}} / {{explain cbo extended}}.

{code}
set hive.simulation.enable=true;

create simulated materialized view mv1_n2 as
select * from emps_n3 where empid < 150;

explain cbo
select *
from (select * from emps_n3 where empid < 120) t
join depts_n2 using (deptno);

drop simulated materialized view mv1_n2;
{code}





[jira] [Created] (HIVE-21156) SharedWorkOptimizer may preserve filter in TS incorrectly

2019-01-23 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21156:
--

 Summary: SharedWorkOptimizer may preserve filter in TS incorrectly
 Key: HIVE-21156
 URL: https://issues.apache.org/jira/browse/HIVE-21156
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For some scan reutilizations, we may end up with a Filter expression associated 
with the scan that should be removed by the optimizer. This can lead to 
incorrect results. Repro case is part of the patch.





[jira] [Created] (HIVE-21184) Add Calcite plan to QueryPlan object

2019-01-29 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21184:
--

 Summary: Add Calcite plan to QueryPlan object
 Key: HIVE-21184
 URL: https://issues.apache.org/jira/browse/HIVE-21184
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


The plan is more readable than the full DAG. Explain formatted/extended will 
print the plan.





[jira] [Created] (HIVE-21188) SemanticAnalyzerException for query on view with masked table

2019-01-30 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21188:
--

 Summary: SemanticAnalyzerException for query on view with masked 
table
 Key: HIVE-21188
 URL: https://issues.apache.org/jira/browse/HIVE-21188
 Project: Hive
  Issue Type: Bug
  Components: Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


The exception occurs when the view reference is fully qualified. The following 
q file can be used to reproduce the issue:

{code}
--! qt:dataset:srcpart
--! qt:dataset:src
set hive.mapred.mode=nonstrict;
set hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;

create database atlasmask;
use atlasmask;
create table masking_test_n8 (key int, value int);
insert into masking_test_n8 values(1,1), (2,2);
create view testv(c,d) as select * from masking_test_n8;

select * from `atlasmask`.`testv`;
{code}





[jira] [Created] (HIVE-21230) HiveJoinAddNotNullRule bails out for outer joins

2019-02-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21230:
--

 Summary: HiveJoinAddNotNullRule bails out for outer joins
 Key: HIVE-21230
 URL: https://issues.apache.org/jira/browse/HIVE-21230
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez


For instance, given the following query:

{code:sql}
SELECT t0.col0, t0.col1
FROM
  (
    SELECT col0, col1 FROM tab
  ) AS t0
  LEFT JOIN
  (
    SELECT col0, col1 FROM tab
  ) AS t1
  ON t0.col0 = t1.col0 AND t0.col1 = t1.col1
{code}

we could still infer that col0 and col1 cannot be null in the right input and 
introduce the corresponding filter predicate. Currently, the rule just bails 
out if it is not an inner join.
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinAddNotNullRule.java#L79
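
A sketch of the inference, assuming equi-join keys only. The names 
not_null_sides and not_null_filters are hypothetical and this is not the 
actual rule code: a side can receive IS NOT NULL filters whenever its rows 
with NULL keys can neither match nor be preserved by the join.

```python
def not_null_sides(join_type):
    """Inputs on which IS NOT NULL filters on the join keys are safe."""
    sides = {
        "inner": ("left", "right"),  # NULL keys never match on either side
        "left": ("right",),          # left rows are preserved, right are not
        "right": ("left",),          # right rows are preserved, left are not
        "full": (),                  # both sides are preserved
    }
    return sides[join_type]


def not_null_filters(join_type, key_pairs):
    """Build the filter predicates per input for the given equi-join keys.

    key_pairs: list of (left_column, right_column) tuples.
    """
    filters = {"left": [], "right": []}
    for side in not_null_sides(join_type):
        idx = 0 if side == "left" else 1
        filters[side] = [f"{pair[idx]} IS NOT NULL" for pair in key_pairs]
    return filters
```

For the LEFT JOIN above, this yields no filter on t0 and the predicates 
"t1.col0 IS NOT NULL" and "t1.col1 IS NOT NULL" on the right input.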





[jira] [Created] (HIVE-21231) HiveJoinAddNotNullRule support for range predicates

2019-02-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21231:
--

 Summary: HiveJoinAddNotNullRule support for range predicates
 Key: HIVE-21231
 URL: https://issues.apache.org/jira/browse/HIVE-21231
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez


For instance, given the following query:

{code:sql}
SELECT t0.col0, t0.col1
FROM
  (
    SELECT col0, col1 FROM tab
  ) AS t0
  INNER JOIN
  (
    SELECT col0, col1 FROM tab
  ) AS t1
  ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
{code}

we could still infer that col0 and col1 cannot be null for any of the inputs. 
Currently we do not.
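
The reasoning is that a range comparison with a NULL operand evaluates to 
UNKNOWN, and an inner join only keeps rows for which the condition is TRUE. 
The following Python sketch models SQL three-valued logic with None as 
NULL/UNKNOWN; all function names are illustrative and not part of Hive.

```python
def sql_lt(a, b):
    """SQL '<' with three-valued logic: None stands for NULL / UNKNOWN."""
    if a is None or b is None:
        return None
    return a < b


def sql_and(x, y):
    """SQL AND with three-valued logic."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def inner_join(left, right):
    """Join on l.col0 < r.col0 AND l.col1 > r.col1; rows are (col0, col1)."""
    return [
        (l, r)
        for l in left
        for r in right
        if sql_and(sql_lt(l[0], r[0]), sql_lt(r[1], l[1])) is True
    ]


def drop_null_keys(rows):
    """The filter the rule could add: col0 IS NOT NULL AND col1 IS NOT NULL."""
    return [row for row in rows if row[0] is not None and row[1] is not None]
```

In this model, filtering out rows with NULL in col0 or col1 on both inputs 
before the join does not change the join result.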





[jira] [Created] (HIVE-21236) SharedWorkOptimizer should check table properties

2019-02-08 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21236:
--

 Summary: SharedWorkOptimizer should check table properties
 Key: HIVE-21236
 URL: https://issues.apache.org/jira/browse/HIVE-21236
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For instance, Calcite may have pushed computation to Druid or a JDBC source: 
the rest of the table structures may look the same, but the embedded query is 
different.





[jira] [Created] (HIVE-21267) Extend HiveRelColumnsAlignment to reorder group-by and join keys on decreasing NDV automatically

2019-02-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21267:
--

 Summary: Extend HiveRelColumnsAlignment to reorder group-by and 
join keys on decreasing NDV automatically
 Key: HIVE-21267
 URL: https://issues.apache.org/jira/browse/HIVE-21267
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


{{HiveRelColumnsAlignment}} was introduced to align the order of columns in 
join, group-by, and order-by operators in the plan pipeline, trying to increase 
the effect of ReduceDeduplication and thus reducing data shuffle.

The optimization could be extended to reorder group-by and join keys on 
decreasing NDV, which would accelerate comparison runtime.
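
A small sketch of the intuition, under the assumption that keys are compared 
column by column and the comparison stops at the first mismatch. The column 
names, NDV values and function names below are made up for illustration.

```python
def order_keys_by_ndv(ndv_by_column):
    """Return the key columns sorted by decreasing number of distinct values."""
    return sorted(ndv_by_column, key=lambda col: ndv_by_column[col], reverse=True)


def comparisons_until_mismatch(row_a, row_b, key_order):
    """Count column comparisons needed to decide whether two keys differ."""
    count = 0
    for col in key_order:
        count += 1
        if row_a[col] != row_b[col]:
            break  # keys differ, no need to look at remaining columns
    return count


# Hypothetical column statistics.
ndv = {"gender": 3, "state": 50, "user_id": 1_000_000}
by_ndv = order_keys_by_ndv(ndv)

# Two keys that differ only in the high-NDV column.
a = {"gender": "F", "state": "CA", "user_id": 1}
b = {"gender": "F", "state": "CA", "user_id": 2}
original_cost = comparisons_until_mismatch(a, b, ["gender", "state", "user_id"])
reordered_cost = comparisons_until_mismatch(a, b, by_ndv)
```

A high-NDV column is the most likely one to differ between two keys, so 
comparing it first tends to end the comparison earlier; in this toy case the 
mismatch is found after one comparison instead of three.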





[jira] [Created] (HIVE-21278) Ambiguity in grammar causes

2019-02-15 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21278:
--

 Summary: Ambiguity in grammar causes 
 Key: HIVE-21278
 URL: https://issues.apache.org/jira/browse/HIVE-21278
 Project: Hive
  Issue Type: Bug
  Components: Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


These are the warnings at compilation time:
{code}
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_DATETIME" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_DATE {LPAREN, StringLiteral}" 
using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_UNIONTYPE LESSTHAN" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK {KW_EXISTS, KW_TINYINT}" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_STRUCT LESSTHAN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:424:5:
Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 10

As a result, alternative(s) 10 were disabled for that input
{code}
This means that multiple parser rules can match certain query text, possibly 
leading to unexpected errors at parsing time.







[jira] [Created] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21293:
--

 Summary: Fix ambiguity in grammar warnings at compilation time (II)
 Key: HIVE-21293
 URL: https://issues.apache.org/jira/browse/HIVE-21293
 Project: Hive
  Issue Type: Bug
  Components: Parser
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


These are the warnings at compilation time:
{code}
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_DATETIME" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_DATE {LPAREN, StringLiteral}" 
using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_UNIONTYPE LESSTHAN" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK {KW_EXISTS, KW_TINYINT}" using 
multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
Decision can match input such as "KW_CHECK KW_STRUCT LESSTHAN" using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:424:5:
Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 10

As a result, alternative(s) 10 were disabled for that input
{code}
This means that multiple parser rules can match certain query text, possibly 
leading to unexpected errors at parsing time.







[jira] [Created] (HIVE-21301) Show tables statement to include views and materialized views

2019-02-20 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21301:
--

 Summary: Show tables statement to include views and materialized 
views
 Key: HIVE-21301
 URL: https://issues.apache.org/jira/browse/HIVE-21301
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


HIVE-19974 introduced a backwards-incompatible change, with the {{SHOW TABLES}} 
statement showing only tables in the system.
This issue will restore the old behavior, with {{SHOW TABLES}} showing all 
queryable entities, including views and materialized views.
In addition, a {{SHOW EXTENDED TABLES}} statement is introduced, which includes 
an additional column with the table type for each of the tables listed.
Finally, the possibility to filter the show tables statements with a 
{{WHERE `table_type` = 'ANY_TYPE'}} clause is introduced.





[jira] [Created] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-26 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21329:
--

 Summary: Custom Tez runtime unordered output buffer size depending 
on operator pipeline
 Key: HIVE-21329
 URL: https://issues.apache.org/jira/browse/HIVE-21329
 Project: Hive
  Issue Type: Improvement
  Components: Tez
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For instance, if we have a reduce sink operator with no keys followed by a 
Group By (merge partial), we can decrease the output buffer size since we will 
only produce a single row.





[jira] [Created] (HIVE-21365) Refactor Hep planner steps in CBO

2019-02-28 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21365:
--

 Summary: Refactor Hep planner steps in CBO
 Key: HIVE-21365
 URL: https://issues.apache.org/jira/browse/HIVE-21365
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Use subprograms to decrease the number of planner instantiations and to benefit 
fully from metadata provider caching, among other benefits.





[jira] [Created] (HIVE-21383) JDBC storage handler: Use catalog and schema to retrieve tables if specified

2019-03-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21383:
--

 Summary: JDBC storage handler: Use catalog and schema to retrieve 
tables if specified
 Key: HIVE-21383
 URL: https://issues.apache.org/jira/browse/HIVE-21383
 Project: Hive
  Issue Type: Improvement
  Components: CBO, JDBC
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-21384) Upgrade to dbcp2 in JDBC storage handler

2019-03-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21384:
--

 Summary: Upgrade to dbcp2 in JDBC storage handler
 Key: HIVE-21384
 URL: https://issues.apache.org/jira/browse/HIVE-21384
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-21384.patch







[jira] [Created] (HIVE-21385) Allow disabling pushdown of non-splittable computation to JDBC sources

2019-03-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21385:
--

 Summary: Allow disabling pushdown of non-splittable computation to 
JDBC sources
 Key: HIVE-21385
 URL: https://issues.apache.org/jira/browse/HIVE-21385
 Project: Hive
  Issue Type: Improvement
  Components: CBO, StorageHandler
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Until pushdown becomes a cost-based decision, we will be able to enable / 
disable pushdown of operators that prevent reading results from the JDBC 
connection in parallel.




