[jira] [Created] (HIVE-22466) Prevent synchronized calls in LlapDaemon configuration

2019-11-06 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-22466:
---

 Summary: Prevent synchronized calls in LlapDaemon configuration
 Key: HIVE-22466
 URL: https://issues.apache.org/jira/browse/HIVE-22466
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0
Reporter: Mustafa Iman
Assignee: Mustafa Iman


LlapDaemonConfig extends Hadoop's Configuration class. The Configuration class 
uses synchronized calls for both reads and writes. LlapDaemon does not change 
its configuration at runtime, so we can remove synchronized access to the 
configuration in LlapConfiguration.
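
One way to picture the idea, as a minimal sketch rather than the actual patch: because the daemon never mutates its configuration after startup, the values can be copied once into an immutable map and read without taking a lock. The class `ReadOnlyConfigSnapshot`, its methods, and the key used in `main` are hypothetical and are not part of Hive or Hadoop.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; not a Hive or Hadoop class. */
final class ReadOnlyConfigSnapshot {
  // Populated once in the constructor and never modified afterwards,
  // so concurrent readers need no synchronization.
  private final Map<String, String> values;

  ReadOnlyConfigSnapshot(Map<String, String> source) {
    // Defensive copy: later writes to the source map are not visible here.
    this.values = Collections.unmodifiableMap(new HashMap<>(source));
  }

  /** Lock-free read; returns defaultValue when the key is absent. */
  String get(String key, String defaultValue) {
    String v = values.get(key);
    return v != null ? v : defaultValue;
  }

  /** Lock-free integer read; returns defaultValue when the key is absent. */
  int getInt(String key, int defaultValue) {
    String v = values.get(key);
    return v != null ? Integer.parseInt(v.trim()) : defaultValue;
  }

  public static void main(String[] args) {
    Map<String, String> source = new HashMap<>();
    source.put("example.num.executors", "12"); // made-up key for the demo
    ReadOnlyConfigSnapshot conf = new ReadOnlyConfigSnapshot(source);
    System.out.println(conf.getInt("example.num.executors", 1));
  }
}
```

The trade-off is that any code path that still expects to write to the configuration would need a different object, which is only acceptable if the no-runtime-changes assumption really holds.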



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22537) getAcidState() not saving directory snapshot causes multiple calls to S3 api

2019-11-25 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-22537:
---

 Summary: getAcidState() not saving directory snapshot causes 
multiple calls to S3 api
 Key: HIVE-22537
 URL: https://issues.apache.org/jira/browse/HIVE-22537
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Mustafa Iman
Assignee: Mustafa Iman


The fix for HIVE-21225 is not enabled in the query coordinator codepath. The 
last argument (generateDirSnapshots) of getAcidState() is set to false when it 
is invoked by callInternal(). Also, the snapshot is not used for file-exists calls.
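
The effect of a directory snapshot can be sketched as follows, assuming a remote listing call that is expensive (as on S3). `DirSnapshotCache` and the `lister` function are hypothetical stand-ins and do not reflect the real getAcidState() signature; the point is only that existence checks are answered from one listing instead of one remote call each.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical illustration of a directory snapshot; not Hive's AcidUtils API. */
final class DirSnapshotCache {
  // Stands in for an expensive remote "list directory" call.
  private final Function<String, List<String>> lister;
  private final Map<String, Set<String>> snapshots = new HashMap<>();
  private int remoteCalls = 0;

  DirSnapshotCache(Function<String, List<String>> lister) {
    this.lister = lister;
  }

  /** Answers an existence check from the snapshot, listing each directory at most once. */
  boolean exists(String dir, String fileName) {
    Set<String> files = snapshots.get(dir);
    if (files == null) {
      remoteCalls++;
      files = new HashSet<>(lister.apply(dir));
      snapshots.put(dir, files);
    }
    return files.contains(fileName);
  }

  int remoteCalls() {
    return remoteCalls;
  }

  public static void main(String[] args) {
    DirSnapshotCache cache =
        new DirSnapshotCache(dir -> Arrays.asList("bucket_00000", "_orc_acid_version"));
    cache.exists("delta_1_1", "bucket_00000");
    cache.exists("delta_1_1", "_orc_acid_version");
    System.out.println("remote listings: " + cache.remoteCalls());
  }
}
```

This sketch is not thread-safe and never invalidates a snapshot, so it only fits a scope where the directory contents are treated as fixed.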





[jira] [Created] (HIVE-22555) Upgrade ORC version to 1.5.8

2019-11-27 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-22555:
---

 Summary: Upgrade ORC version to 1.5.8
 Key: HIVE-22555
 URL: https://issues.apache.org/jira/browse/HIVE-22555
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Mustafa Iman
Assignee: Mustafa Iman


Hive currently depends on ORC 1.5.6. We need to upgrade to 1.5.8 for 
https://issues.apache.org/jira/browse/HIVE-22499

ORC 1.5.7 includes https://issues.apache.org/jira/browse/ORC-361, which causes 
some tests that override MemoryManager to fail. These need to be addressed while 
upgrading.

 





[jira] [Created] (HIVE-22877) Wrong decimal boundary for casting to Decimal64

2020-02-11 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-22877:
---

 Summary: Wrong decimal boundary for casting to Decimal64
 Key: HIVE-22877
 URL: https://issues.apache.org/jira/browse/HIVE-22877
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 4.0.0
Reporter: Mustafa Iman
Assignee: Mustafa Iman


During vectorization, decimal fields that are obtained via generic UDFs are 
cast to Decimal64 in some circumstances. For the decimal to Decimal64 cast, Hive 
compares the source column's `scale + precision` to 18 (the maximum number of 
digits that can always be represented by a long). However, a decimal fits in a 
long as long as its `precision` (the total number of digits) is smaller than or 
equal to 18. Scale is irrelevant.

Since the vectorized generic UDF expression also takes scale into account, it 
computes the wrong output column vector: Decimal instead of Decimal64. This in 
turn causes a ClassCastException down the operator chain.
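
The boundary can be illustrated with a small sketch, assuming the SQL convention that precision is the total number of digits and scale is the number of fractional digits. Under that assumption the unscaled value of a decimal(p, s) has at most p digits, so only p needs to be compared against 18. The class and method names below are hypothetical and are not Hive's vectorization code.

```java
/** Hypothetical illustration of the Decimal64 boundary; not Hive code. */
final class Decimal64Boundary {
  // Long.MAX_VALUE has 19 digits, so every integer of up to 18 digits fits in a long.
  static final int MAX_DECIMAL64_PRECISION = 18;

  /** The kind of check this issue describes as wrong: scale is added to precision. */
  static boolean fitsUsingScalePlusPrecision(int precision, int scale) {
    return precision + scale <= MAX_DECIMAL64_PRECISION;
  }

  /** Check on the total digit count only; scale is deliberately ignored. */
  static boolean fitsUsingPrecisionOnly(int precision, int scale) {
    return precision <= MAX_DECIMAL64_PRECISION;
  }

  public static void main(String[] args) {
    // decimal(18,2): 18 digits in total, 2 of them after the decimal point.
    System.out.println(fitsUsingScalePlusPrecision(18, 2));
    System.out.println(fitsUsingPrecisionOnly(18, 2));
  }
}
```

For decimal(18,2) the two checks disagree, which is the kind of mismatch that lets one part of the plan choose Decimal while another expects Decimal64.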

The query below fails with a ClassCastException:

 
{code:sql}
-- Minimal tables that reproduce the failure.
create table mini_store
(
 s_store_sk int,
 s_store_id string
)
row format delimited fields terminated by '\t'
STORED AS ORC;

create table mini_sales
(
 ss_store_sk int,
 ss_quantity int,
 ss_sales_price decimal(7,2)
)
row format delimited fields terminated by '\t'
STORED AS ORC;

insert into mini_store values (1, 'store');
insert into mini_sales values (1, 2, 1.2);

-- coalesce() over the decimal product is the generic UDF that triggers the failure.
select s_store_id, coalesce(ss_sales_price*ss_quantity,0) sumsales
from mini_sales, mini_store where ss_store_sk = s_store_sk;
{code}
 

 





[jira] [Created] (HIVE-23175) Skip serializing hadoop and tez config on HS side

2020-04-09 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23175:
---

 Summary: Skip serializing hadoop and tez config on HS side
 Key: HIVE-23175
 URL: https://issues.apache.org/jira/browse/HIVE-23175
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


HiveServer spends a lot of time serializing configuration objects. We can skip 
putting the Hadoop and Tez config xml files in the payload, assuming that the 
configs are the same on both the HS and AM sides. This depends on Tez loading 
local xml configs when creating config objects: 
https://issues.apache.org/jira/browse/TEZ-4141





[jira] [Created] (HIVE-23180) Remove unused variables from tez build dag

2020-04-10 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23180:
---

 Summary: Remove unused variables from tez build dag
 Key: HIVE-23180
 URL: https://issues.apache.org/jira/browse/HIVE-23180
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman
 Attachments: HIVE-23180.patch

This is a simple refactoring around the TezTask build dag functionality. Unused 
options are removed from function calls, and some variables are given 
meaningful names. It also gets rid of an unnecessary filesystem creation.





[jira] [Created] (HIVE-23191) Prevent redundant output descriptor config serialization

2020-04-13 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23191:
---

 Summary: Prevent redundant output descriptor config serialization
 Key: HIVE-23191
 URL: https://issues.apache.org/jira/browse/HIVE-23191
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


{code:java}
DagUtils#createVertex(JobConf, BaseWork, Path,
 TezWork, Map){code}
creates an output descriptor if the vertex is a leaf vertex. It uses the same 
config object that is used in the processor descriptor. It should not create the 
payload from scratch when the processor descriptor has an identical payload.
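
The intended saving can be sketched like this, under the assumption that both descriptors really are built from the same configuration. `PayloadReuseSketch`, `Descriptor`, and `serialize` are hypothetical stand-ins and are not the Tez or Hive DagUtils API; the sketch only shows one serialization feeding two descriptors.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; Descriptor is not a Tez class. */
final class PayloadReuseSketch {
  // Counts how often the expensive step runs, for demonstration only.
  static int serializations = 0;

  /** Stand-in for an expensive config-to-bytes step. */
  static byte[] serialize(Map<String, String> conf) {
    serializations++;
    StringBuilder sb = new StringBuilder();
    // Sorted iteration keeps the output deterministic.
    for (Map.Entry<String, String> e : new TreeMap<>(conf).entrySet()) {
      sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
    }
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  static final class Descriptor {
    final String kind;
    final byte[] payload; // shared between descriptors; must be treated as read-only

    Descriptor(String kind, byte[] payload) {
      this.kind = kind;
      this.payload = payload;
    }
  }

  /** Builds the processor and output descriptors from a single serialized payload. */
  static Descriptor[] buildLeafVertexDescriptors(Map<String, String> conf) {
    byte[] payload = serialize(conf);
    return new Descriptor[] {
      new Descriptor("processor", payload),
      new Descriptor("output", payload)
    };
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("example.key", "value"); // made-up key for the demo
    buildLeafVertexDescriptors(conf);
    System.out.println("serializations: " + serializations);
  }
}
```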





[jira] [Created] (HIVE-23448) Remove hive-site.xml from input/output/processor payload

2020-05-11 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23448:
---

 Summary: Remove hive-site.xml from input/output/processor payload
 Key: HIVE-23448
 URL: https://issues.apache.org/jira/browse/HIVE-23448
 Project: Hive
  Issue Type: Improvement
  Components: Tez
Reporter: Mustafa Iman
Assignee: Mustafa Iman


Depends on https://jira.apache.org/jira/browse/TEZ-4137

We removed most xml configs from payloads in 
https://jira.apache.org/jira/browse/HIVE-23175 

However, hive-site.xml could not be removed from those configs at that early 
stage for reasons outlined in that jira.

This Jira removes hive-site.xml configs from the configuration just before 
serializing payloads.





[jira] [Created] (HIVE-23455) Improve error message for external orc table

2020-05-12 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23455:
---

 Summary: Improve error message for external orc table
 Key: HIVE-23455
 URL: https://issues.apache.org/jira/browse/HIVE-23455
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


Since there is no schema validation for external tables, users may face various 
errors if their orc data and external table schema do not match. If the orc 
schema has fewer columns than the projection, OrcEncodedDataConsumer may receive 
an incomplete TypeDescription array, which will manifest itself as a 
NullPointerException later.

We can at least verify that OrcEncodedDataConsumer gets enough 
TypeDescriptions. If the assertion fails, the user sees that there is something 
wrong with the schema and hopefully resolves the problem quickly. If there are 
enough columns in the file but the schema of the query does not match, the user 
generally sees a ClassCastException. If there are enough columns and the types 
accidentally match, there is nothing we can do as this is an external table.

We have seen this when trying to use a managed table as an external table 
location. Although the user-facing schemas are the same, the managed table has 
acid-related metadata. I am adding a q file demonstrating the 
NullPointerException with TestMiniLlapLocalCliDriver and the output after the 
fix. I haven't added this to precommit tests as it is hard to assert the 
exception message from the mini driver framework and effectively it is just 
changing the error.





[jira] [Created] (HIVE-23529) CTAS is broken for uniontype when row_deserialize

2020-05-21 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23529:
---

 Summary: CTAS is broken for uniontype when row_deserialize
 Key: HIVE-23529
 URL: https://issues.apache.org/jira/browse/HIVE-23529
 Project: Hive
  Issue Type: Bug
Reporter: Mustafa Iman
Assignee: Mustafa Iman


CTAS queries fail when there is a uniontype in the source table and 
hive.vectorized.use.vector.serde.deserialize=false.

ObjectInspectorUtils.copyToStandardObject in the ROW_DESERIALIZE path extracts 
the value from the union type. However, VectorAssignRow expects a StandardUnion 
object, causing a ClassCastException for any such CTAS query.





[jira] [Created] (HIVE-23584) Dont send default configs(HiveConf) to AM and tasks

2020-05-30 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23584:
---

 Summary: Dont send default configs(HiveConf) to AM and tasks
 Key: HIVE-23584
 URL: https://issues.apache.org/jira/browse/HIVE-23584
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
 Attachments: hiveconf.wip.patch

About 80% of the configs left after HIVE-23175 are default settings coming from 
the HiveConf object. We can remove these from the payload as well. The only 
problem is that TezTask relies on some of these configs when building the dag. 
We can explicitly add the settings that are needed in the dag build phase in 
HiveServer2. The rest can safely be removed from the payload.
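
A minimal sketch of the filtering step, assuming the defaults are available as a map and that the settings needed during dag build are known by key. `DefaultConfigFilter`, `stripDefaults`, and the keys in `main` are hypothetical and are not HiveConf API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical illustration; not part of HiveConf. */
final class DefaultConfigFilter {
  /**
   * Returns the entries of conf that either differ from the defaults or are
   * explicitly listed in alwaysKeep (settings needed while building the dag).
   */
  static Map<String, String> stripDefaults(
      Map<String, String> conf, Map<String, String> defaults, Set<String> alwaysKeep) {
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String key = e.getKey();
      boolean isDefault =
          defaults.containsKey(key) && Objects.equals(defaults.get(key), e.getValue());
      if (!isDefault || alwaysKeep.contains(key)) {
        out.put(key, e.getValue());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> defaults = new HashMap<>();
    defaults.put("example.a", "1"); // made-up keys for the demo
    defaults.put("example.b", "2");
    Map<String, String> conf = new HashMap<>(defaults);
    conf.put("example.b", "20"); // overridden by the user
    System.out.println(stripDefaults(conf, defaults, Collections.singleton("example.a")));
  }
}
```

The receiving side can only reconstruct the dropped values if it loads the same defaults, which is the assumption the whole approach rests on.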





[jira] [Created] (HIVE-23589) Eliminate blocking call to scheduler for isguaranteed

2020-06-01 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23589:
---

 Summary: Eliminate blocking call to scheduler for isguaranteed
 Key: HIVE-23589
 URL: https://issues.apache.org/jira/browse/HIVE-23589
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


LlapTaskCommunicator requires isGuaranteed information from LlapTaskScheduler 
just before sending the actual request to the executor. There is no particular 
ordering between writing the submit request to the network and scheduler 
decisions. Therefore, we can just put the initial isGuaranteed information in 
the task allocation rather than calling back into the scheduler.

Depends on https://jira.apache.org/jira/browse/TEZ-4192





[jira] [Created] (HIVE-23595) Do not query task guaranteed status when wlm off

2020-06-02 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23595:
---

 Summary: Do not query task guaranteed status when wlm off
 Key: HIVE-23595
 URL: https://issues.apache.org/jira/browse/HIVE-23595
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


LlapTaskCommunicator queries the scheduler for every task's guaranteed status. 
When workload management is off, the status is always false, so there is no need 
for the synchronous check.





[jira] [Created] (HIVE-23596) Encode guaranteed task information in containerId

2020-06-02 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23596:
---

 Summary: Encode guaranteed task information in containerId
 Key: HIVE-23596
 URL: https://issues.apache.org/jira/browse/HIVE-23596
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


We should avoid calling LlapTaskScheduler to get the initial isGuaranteed flag 
for all the tasks. It causes arbitrary delays in sending tasks out. Since the 
communicator is a single thread, any blocking there delays all the tasks.

There are [https://jira.apache.org/jira/browse/TEZ-4192] and 
[https://jira.apache.org/jira/browse/HIVE-23589] for a proper solution to this. 
However, that requires a Tez release, which seems far off right now. We can 
replace the current hack with another hack that does not require locking.
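
One possible shape for such an encoding, purely as a sketch: reserve one bit of a numeric id for the guaranteed flag so the reader can recover it without asking the scheduler. The layout below (flag in the lowest bit) and the class `GuaranteedBitSketch` are assumptions made for illustration; the issue text does not say how the flag is actually stored in the container id.

```java
/** Hypothetical bit layout; not the actual Hive or YARN ContainerId encoding. */
final class GuaranteedBitSketch {
  /** Packs a non-negative sequence number and the flag into one long. */
  static long encode(long sequence, boolean guaranteed) {
    if (sequence < 0) {
      throw new IllegalArgumentException("sequence must be non-negative");
    }
    return (sequence << 1) | (guaranteed ? 1L : 0L);
  }

  /** Reads the flag back without any call to the scheduler. */
  static boolean isGuaranteed(long id) {
    return (id & 1L) == 1L;
  }

  /** Recovers the original sequence number. */
  static long sequence(long id) {
    return id >>> 1;
  }

  public static void main(String[] args) {
    long id = encode(42L, true);
    System.out.println(sequence(id) + " guaranteed=" + isGuaranteed(id));
  }
}
```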





[jira] [Created] (HIVE-23629) Enforce clean findbugs in PRs

2020-06-06 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23629:
---

 Summary: Enforce clean findbugs in PRs
 Key: HIVE-23629
 URL: https://issues.apache.org/jira/browse/HIVE-23629
 Project: Hive
  Issue Type: Sub-task
Reporter: Mustafa Iman
Assignee: Mustafa Iman


We should start enforcing clean findbugs reports as soon as we fix a module. 
Otherwise, the module will continue to accumulate findbugs errors. We can add a 
stage to the Jenkins pipeline to enforce findbugs and possibly other checks. It 
will selectively run findbugs for the specified sub modules. Eventually we can 
get rid of the list and enable findbugs for the whole project.





[jira] [Created] (HIVE-23686) Fix Spotbugs issues in hive-shims

2020-06-13 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23686:
---

 Summary: Fix Spotbugs issues in hive-shims
 Key: HIVE-23686
 URL: https://issues.apache.org/jira/browse/HIVE-23686
 Project: Hive
  Issue Type: Sub-task
Reporter: Mustafa Iman
Assignee: Mustafa Iman








[jira] [Created] (HIVE-23687) Fix Spotbugs issues in hive-standalone-metastore

2020-06-13 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23687:
---

 Summary: Fix Spotbugs issues in hive-standalone-metastore
 Key: HIVE-23687
 URL: https://issues.apache.org/jira/browse/HIVE-23687
 Project: Hive
  Issue Type: Sub-task
Reporter: Mustafa Iman
Assignee: Mustafa Iman








[jira] [Created] (HIVE-23744) Reduce query startup latency

2020-06-22 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23744:
---

 Summary: Reduce query startup latency
 Key: HIVE-23744
 URL: https://issues.apache.org/jira/browse/HIVE-23744
 Project: Hive
  Issue Type: Task
  Components: llap
Affects Versions: 4.0.0
Reporter: Mustafa Iman
Assignee: Mustafa Iman
 Attachments: am_schedule_and_transmit.png, task_start.png

When I run queries with a large number of tasks for a single vertex, I see a 
significant delay before all tasks start execution in llap daemons. 

Although llap daemons have the free capacity to run the tasks, it takes a 
significant time to schedule all the tasks in the AM and actually transmit them 
to executors.

"am_schedule_and_transmit" shows the scheduling of tasks of tpcds query 55. It 
shows only the tasks scheduled for one of 10 llap daemons. The scheduler works 
in a single thread, scheduling tasks one by one. A delay in scheduling one 
task delays all the tasks.

!am_schedule_and_transmit.png|width=831,height=573!

 

Another issue is that it takes a long time to fill all the execution slots in 
llap daemons even though they are all empty initially. This is caused by 
LlapTaskCommunicator using a fixed number of threads (10 by default) to send 
the tasks to daemons. This communication is also synchronous, so these threads 
sit idle while blocked on communication. "task_start.png" shows running tasks on 
an llap daemon that has 12 execution slots. By the time the 12th task starts 
running, more than 100ms has already passed. That slot stays idle all this time. 

!task_start.png|width=1166,height=635!





[jira] [Created] (HIVE-23745) Avoid copying userpayload in task communicator

2020-06-22 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23745:
---

 Summary: Avoid copying userpayload in task communicator
 Key: HIVE-23745
 URL: https://issues.apache.org/jira/browse/HIVE-23745
 Project: Hive
  Issue Type: Sub-task
Reporter: Mustafa Iman
Assignee: Mustafa Iman


[https://github.com/apache/hive/blob/master/llap-common/src/java/org/apache/hadoop/hive/llap/tez/Converters.java#L182]
 I see this copy take a few milliseconds sometimes. The delay here adds up 
across all tasks of a single vertex in LlapTaskCommunicator, as it processes 
tasks one by one. The user payload never changes in this codepath. The copy is 
made because of limitations of the Protobuf library. Protobuf 3.1 adds an 
UnsafeByteOperations class that avoids copying ByteBuffers, so this can be 
resolved when Protobuf is upgraded.





[jira] [Created] (HIVE-23746) Send task attempts async from AM to daemons

2020-06-22 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23746:
---

 Summary: Send task attempts async from AM to daemons
 Key: HIVE-23746
 URL: https://issues.apache.org/jira/browse/HIVE-23746
 Project: Hive
  Issue Type: Sub-task
  Components: llap
Reporter: Mustafa Iman
Assignee: Mustafa Iman


LlapTaskCommunicator uses a sync client to send task attempts. There is a fixed 
number of communication threads (10 by default). This causes unnecessary 
delays when there are enough free execution slots in daemons but they do not 
receive all the tasks because of this bottleneck. LlapTaskCommunicator can use 
an async client to pass these tasks to daemons. 
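
The difference can be sketched with plain `java.util.concurrent` classes. In this hypothetical example a single timer thread simulates a non-blocking RPC layer: every request is issued before any acknowledgement is awaited, so the number of in-flight requests is not capped by the number of sender threads. `AsyncSubmitSketch` and `submitAsync` are illustrative names and are not the LLAP protocol client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration; not the LLAP protocol client. */
final class AsyncSubmitSketch {
  /** Stand-in for a non-blocking RPC: the future completes after a simulated round trip. */
  static CompletableFuture<String> submitAsync(
      ScheduledExecutorService timer, int taskId, long roundTripMillis) {
    CompletableFuture<String> ack = new CompletableFuture<>();
    timer.schedule(() -> { ack.complete("ack-" + taskId); },
        roundTripMillis, TimeUnit.MILLISECONDS);
    return ack;
  }

  /** Issues every request first, then waits for the acknowledgements. */
  static List<String> submitAll(int tasks, long roundTripMillis) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    try {
      List<CompletableFuture<String>> inFlight = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        inFlight.add(submitAsync(timer, i, roundTripMillis));
      }
      List<String> acks = new ArrayList<>();
      for (CompletableFuture<String> f : inFlight) {
        acks.add(f.join());
      }
      return acks;
    } finally {
      timer.shutdown();
    }
  }

  public static void main(String[] args) {
    // All 12 simulated round trips overlap instead of running back to back.
    System.out.println(submitAll(12, 20).size());
  }
}
```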





[jira] [Created] (HIVE-23747) Increase the number of parallel tasks sent to daemons from am

2020-06-22 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23747:
---

 Summary: Increase the number of parallel tasks sent to daemons 
from am
 Key: HIVE-23747
 URL: https://issues.apache.org/jira/browse/HIVE-23747
 Project: Hive
  Issue Type: Sub-task
Reporter: Mustafa Iman
Assignee: Mustafa Iman


The number of inflight tasks from the AM to a single executor is currently 
hardcoded to 1 
([https://github.com/apache/hive/blob/master/llap-client/src/java/org/apache/hadoop/hive/llap/tez/LlapProtocolClientProxy.java#L57]). 
It does not make sense to increase this right now as communication between 
the AM and daemons happens synchronously anyway. After resolving 
https://issues.apache.org/jira/browse/HIVE-23746, this must be increased to at 
least the number of execution slots per daemon.





[jira] [Created] (HIVE-23780) Fail dropTable if acid cleanup fails

2020-06-29 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23780:
---

 Summary: Fail dropTable if acid cleanup fails
 Key: HIVE-23780
 URL: https://issues.apache.org/jira/browse/HIVE-23780
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Standalone Metastore, Transactions
Reporter: Mustafa Iman
Assignee: Mustafa Iman


Acid cleanup happens after dropTable is committed. If cleanup fails for some 
reason, there are leftover entries in the acid tables. This later makes the 
dropped table's name unusable for new tables.

[~pvary] [~ngangam]





[jira] [Created] (HIVE-23830) Remove shutdownhook after query is completed

2020-07-09 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23830:
---

 Summary: Remove shutdownhook after query is completed
 Key: HIVE-23830
 URL: https://issues.apache.org/jira/browse/HIVE-23830
 Project: Hive
  Issue Type: Bug
Reporter: Mustafa Iman
Assignee: Mustafa Iman


Each query registers a shutdownHook to release transactional resources in case 
the JVM shuts down mid query. These hooks are not cleaned up until the session 
is closed. Session lifetime is unbounded, so these hooks are a memory leak. They 
should be removed as soon as the transaction is completed.
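
The register-then-remove pattern can be sketched with the plain JVM API (`Runtime.addShutdownHook` and `Runtime.removeShutdownHook`). This is an illustration under the assumption that one hook is created per query; `QueryShutdownHook` is a hypothetical class, and Hive's own hook management may look different.

```java
/** Hypothetical illustration using the plain JVM API; not Hive's hook manager. */
final class QueryShutdownHook implements AutoCloseable {
  private final Thread hook;

  /** Registers a hook that runs releaseResources if the JVM exits mid query. */
  QueryShutdownHook(Runnable releaseResources) {
    this.hook = new Thread(releaseResources, "query-cleanup-hook");
    Runtime.getRuntime().addShutdownHook(hook);
  }

  /** Deregisters the hook; returns true if it was still registered. */
  boolean unregister() {
    return Runtime.getRuntime().removeShutdownHook(hook);
  }

  /** Called when the query completes, so the hook is not retained afterwards. */
  @Override
  public void close() {
    unregister();
  }

  public static void main(String[] args) {
    try (QueryShutdownHook hook =
        new QueryShutdownHook(() -> System.out.println("releasing resources"))) {
      System.out.println("query running");
    }
    // The hook has been removed here, so it neither runs at exit nor leaks.
  }
}
```

Tying removal to `close()` means the hook lives exactly as long as the query, rather than as long as the session.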





[jira] [Created] (HIVE-23975) Reuse evicted keys from aggregation buffers

2020-08-02 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-23975:
---

 Summary: Reuse evicted keys from aggregation buffers
 Key: HIVE-23975
 URL: https://issues.apache.org/jira/browse/HIVE-23975
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman








[jira] [Created] (HIVE-24093) Remove unused hive.debug.localtask

2020-08-30 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-24093:
---

 Summary: Remove unused hive.debug.localtask
 Key: HIVE-24093
 URL: https://issues.apache.org/jira/browse/HIVE-24093
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


hive.debug.localtask was added in HIVE-1642. Even then, it was never used. It 
was possibly a leftover from development/debugging. There are no references to 
either HIVEDEBUGLOCALTASK or hive.debug.localtask in the codebase.





[jira] [Created] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed

2020-09-09 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-24139:
---

 Summary: VectorGroupByOperator is not flushing hash table entries 
as needed
 Key: HIVE-24139
 URL: https://issues.apache.org/jira/browse/HIVE-24139
 Project: Hive
  Issue Type: Bug
Reporter: Mustafa Iman
Assignee: Mustafa Iman


https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where 
copyKey mutates some key wrappers while copying. This Jira is to fix it.





[jira] [Created] (HIVE-24270) Move scratchdir cleanup to background

2020-10-13 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-24270:
---

 Summary: Move scratchdir cleanup to background
 Key: HIVE-24270
 URL: https://issues.apache.org/jira/browse/HIVE-24270
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


In cloud environments, scratchdir cleanup at the end of the query may take a 
long time. This causes the client to hang for up to 1 minute even after the 
results have been streamed back. During this time the client just waits for the 
cleanup to finish. Cleanup can take place in the background in HiveServer.
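
A minimal sketch of handing cleanup to a background thread, assuming the cleanup is an independent task that nothing on the query's response path depends on. `BackgroundCleanupSketch` and the path in `main` are hypothetical and are not HiveServer2 code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration; not HiveServer2 code. */
final class BackgroundCleanupSketch {
  // A single daemon thread: cleanup never blocks the caller or JVM shutdown.
  private final ExecutorService cleaner = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "scratchdir-cleaner");
    t.setDaemon(true);
    return t;
  });

  /** Returns immediately; the cleanup itself runs on the background thread. */
  Future<?> scheduleCleanup(Runnable cleanup) {
    return cleaner.submit(cleanup);
  }

  /** Stops accepting new work; already queued cleanups still run. */
  void shutdown() {
    cleaner.shutdown();
  }

  public static void main(String[] args) throws Exception {
    BackgroundCleanupSketch server = new BackgroundCleanupSketch();
    Future<?> pending =
        server.scheduleCleanup(() -> System.out.println("deleting /tmp/example-scratch"));
    System.out.println("results already returned to the client");
    pending.get(); // only the demo waits; a real caller would not
    server.shutdown();
  }
}
```

With a daemon thread, cleanups still queued at JVM exit are dropped, so a real implementation would need some fallback for leftover directories.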





[jira] [Created] (HIVE-24350) NullScanTaskDispatcher should use stats

2020-11-02 Thread Mustafa Iman (Jira)
Mustafa Iman created HIVE-24350:
---

 Summary: NullScanTaskDispatcher should use stats
 Key: HIVE-24350
 URL: https://issues.apache.org/jira/browse/HIVE-24350
 Project: Hive
  Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman


NullScanTaskDispatcher manually checks each partition directory to see if it 
is empty. While this is necessary for external tables, we can just use stats 
for managed tables.
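
The decision can be sketched as follows, assuming a file-count statistic is available for managed tables and that a value of -1 means the statistic is unknown. `EmptyPartitionCheck`, its parameters, and the -1 convention are hypothetical and are not the NullScanTaskDispatcher code.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration; not the NullScanTaskDispatcher implementation. */
final class EmptyPartitionCheck {
  /**
   * @param managed        whether the table is managed, so its stats are assumed reliable
   * @param numFilesStat   file count taken from stats, or -1 when unknown (assumed convention)
   * @param listDirIsEmpty fallback that inspects the partition directory itself
   */
  static boolean isEmpty(boolean managed, long numFilesStat, BooleanSupplier listDirIsEmpty) {
    if (managed && numFilesStat >= 0) {
      // Stats answer the question without touching the filesystem.
      return numFilesStat == 0;
    }
    // External tables, or missing stats, still require the directory check.
    return listDirIsEmpty.getAsBoolean();
  }

  public static void main(String[] args) {
    System.out.println(isEmpty(true, 0, () -> {
      throw new IllegalStateException("filesystem should not be consulted");
    }));
  }
}
```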


