[jira] [Created] (HIVE-25646) Thrift metastore URI reverse resolution could fail in some environments

2021-10-25 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-25646:


 Summary: Thrift metastore URI reverse resolution could fail in 
some environments
 Key: HIVE-25646
 URL: https://issues.apache.org/jira/browse/HIVE-25646
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.1.2, 4.0.0
Reporter: Prasanth Jayachandran


When a custom URI resolver is not specified, the default Thrift metastore URI 
goes through DNS reverse resolution (getCanonicalHostName), which is unlikely to 
resolve correctly when the HMS is sitting behind load balancers and proxies. This is a 
change in behaviour from the Hive 2.x branch and isn't required. If reverse 
resolution is required, a custom URI resolver can be implemented.
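As a quick illustration of why reverse resolution misbehaves behind a load balancer, the sketch below (the hostname is a made-up example, not Hive code) shows that InetAddress.getCanonicalHostName performs a reverse (PTR) lookup on the resolved IP, so the returned name can be a proxy or LB node rather than the address the client was configured with:
{code:java}
// Illustrative only; the hostname below is a hypothetical LB name.
import java.net.InetAddress;

public class ReverseResolveDemo {
  public static void main(String[] args) throws Exception {
    String configuredHost = "hms-lb.example.com";
    InetAddress addr = InetAddress.getByName(configuredHost);
    // getCanonicalHostName does a reverse (PTR) lookup on the resolved IP, which may
    // return a different name than the one the client was configured with.
    System.out.println("configured: " + configuredHost);
    System.out.println("canonical : " + addr.getCanonicalHostName());
  }
}
{code}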





[jira] [Created] (HIVE-24866) FileNotFoundException during alter table concat

2021-03-10 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24866:


 Summary: FileNotFoundException during alter table concat
 Key: HIVE-24866
 URL: https://issues.apache.org/jira/browse/HIVE-24866
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.4.0, 3.2.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Because of the way the CombineFile input format groups files based on node and rack 
locality, there are cases where a single big ORC file gets spread across two or more 
combined Hive splits. When the first task completes, the source ORC file of the 
concatenation is moved/renamed as part of jobCloseOp, which can lead to a 
FileNotFoundException in subsequent mappers that hold a partial split of that file.

A simple fix would be for the mapper that has the start of the split to own the entire 
ORC file for concatenation. If a mapper gets a partial split that is not the 
start, it can skip the entire file.
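A minimal sketch of the proposed ownership rule, assuming the standard FileSplit API (this is only an illustration, not the actual patch):
{code:java}
import org.apache.hadoop.mapred.FileSplit;

public final class ConcatOwnership {
  // Only the mapper whose split starts at byte 0 of the file owns the whole ORC
  // file for concatenation; mappers holding later partial splits skip it.
  public static boolean ownsFileForConcatenation(FileSplit split) {
    return split.getStart() == 0;
  }
}
{code}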





[jira] [Created] (HIVE-24786) JDBC HttpClient should retry for idempotent and unsent http methods

2021-02-16 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24786:


 Summary: JDBC HttpClient should retry for idempotent and unsent 
http methods
 Key: HIVE-24786
 URL: https://issues.apache.org/jira/browse/HIVE-24786
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When HiveServer2 is behind multiple proxies, there is a possibility of "broken 
pipe", "connect timeout" and "read timeout" exceptions if one of the 
intermediate proxies or load balancers decides to reset the underlying TCP 
socket after an idle timeout. When the connection is broken and a query 
is submitted after the idle timeout, from the beeline (or client) perspective the 
connection is open, but HTTP methods (POST/GET) fail with socket-related 
exceptions. Since these methods were never sent to the server, they are safe for 
client-side retries.
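A rough sketch of the kind of retry handler this describes, using the Apache HttpClient 4.x HttpRequestRetryHandler interface (an illustration of the idea, not the Hive JDBC change itself):
{code:java}
import java.io.IOException;
import org.apache.http.HttpRequest;
import org.apache.http.client.HttpRequestRetryHandler;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.protocol.HttpContext;

// Retry only when the request was never transmitted, or when the method is idempotent.
public class UnsentOrIdempotentRetryHandler implements HttpRequestRetryHandler {
  private final int maxRetries;

  public UnsentOrIdempotentRetryHandler(int maxRetries) {
    this.maxRetries = maxRetries;
  }

  @Override
  public boolean retryRequest(IOException exception, int executionCount, HttpContext context) {
    if (executionCount > maxRetries) {
      return false;
    }
    HttpClientContext clientContext = HttpClientContext.adapt(context);
    if (!clientContext.isRequestSent()) {
      // The request never reached the server, so a retry cannot double-execute it.
      return true;
    }
    HttpRequest request = clientContext.getRequest();
    String method = request.getRequestLine().getMethod();
    return "GET".equalsIgnoreCase(method);
  }
}
{code}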





[jira] [Created] (HIVE-24514) UpdateMDatabaseURI does not update managed location URI

2020-12-10 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24514:


 Summary: UpdateMDatabaseURI does not update managed location URI
 Key: HIVE-24514
 URL: https://issues.apache.org/jira/browse/HIVE-24514
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When the FS root is updated using metatool, if the DB has a managed location defined, 
the updateMDatabaseURI API should update the managed location as well. Currently it 
only updates the location URI.





[jira] [Created] (HIVE-24501) UpdateInputAccessTimeHook should not update stats

2020-12-08 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24501:


 Summary: UpdateInputAccessTimeHook should not update stats
 Key: HIVE-24501
 URL: https://issues.apache.org/jira/browse/HIVE-24501
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


UpdateInputAccessTimeHook can fail for transactional tables with the following 
exception.

The hook should skip updating the stats and only update the access time.
{code:java}
ERROR : FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Unable to alter table. Cannot change stats state for a transactional table default.test without providing the transactional write state for verification (new write ID 0, valid write IDs default.test:8:9223372036854775807::1,2,3,4,7; current state {"BASIC_STATS":"true","COLUMN_STATS":{"id":"true","name":"true"}}; new state null)
ERROR : FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Unable to alter table. Cannot change stats state for a transactional table default.test without providing the transactional write state for verification (new write ID 0, valid write IDs default.test:8:9223372036854775807::1,2,3,4,7; current state {"BASIC_STATS":"true","COLUMN_STATS":{"id":"true","name":"true"}}; new state null)
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Cannot change stats state for a transactional table default.test without providing the transactional write state for verification (new write ID 0, valid write IDs default.test:8:9223372036854775807::1,2,3,4,7; current state {"BASIC_STATS":"true","COLUMN_STATS":{"id":"true","name":"true"}}; new state null
 at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:821)
 at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:769)
 at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:756)
 at org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:70)
 at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296)
 at org.apache.hadoop.hive.ql.HookRunner.runPreHooks(HookRunner.java:273)
 at org.apache.hadoop.hive.ql.Executor.preExecutionActions(Executor.java:155)
 at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:107)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
 at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
 at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
 at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: MetaException(message:Cannot change stats state for a transactional table default.test without providing the transactional write state for verification (new write ID 0, valid write IDs default.test:8:9223372036854775807::1,2,3,4,7; current state {"BASIC_STATS":"true","COLUMN_STATS":{"id":"true","name":"true"}}; new state null)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:2584)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2571)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:487)
 at 

[jira] [Created] (HIVE-24142) Provide config to skip umask validation during scratch dir creation

2020-09-10 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24142:


 Summary: Provide config to skip umask validation during scratch 
dir creation
 Key: HIVE-24142
 URL: https://issues.apache.org/jira/browse/HIVE-24142
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran


When an HS2 session creates scratch dirs, it performs umask validation that is 
mainly specific to HDFS. There are environments where scratch dirs can be on a 
different filesystem whose paths are writable but have a different umask. It would be 
good to have a config to skip the umask validation.





[jira] [Created] (HIVE-24068) ReExecutionOverlayPlugin can handle DAG submission failures as well

2020-08-24 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-24068:


 Summary: ReExecutionOverlayPlugin can handle DAG submission 
failures as well
 Key: HIVE-24068
 URL: https://issues.apache.org/jira/browse/HIVE-24068
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


ReExecutionOverlayPlugin handles cases where there is a vertex failure. DAG 
submission failures can also happen in environments where the AM container died, 
causing DNS issues. DAG submissions are safe to retry because the DAG hasn't started 
executing yet.





[jira] [Created] (HIVE-23582) LLAP: Make SplitLocationProvider impl pluggable

2020-05-29 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23582:


 Summary: LLAP: Make SplitLocationProvider impl pluggable
 Key: HIVE-23582
 URL: https://issues.apache.org/jira/browse/HIVE-23582
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


LLAP uses the HostAffinitySplitLocationProvider implementation by default. For 
non-ZooKeeper-based environments, a different split location provider may be used. 
To facilitate that, make the SplitLocationProvider implementation class 
pluggable. 
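A minimal sketch of what "pluggable" could look like, assuming a hypothetical config key (neither the key nor the factory below is the committed change):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.split.SplitLocationProvider;
import org.apache.hadoop.util.ReflectionUtils;

public final class SplitLocationProviderFactory {
  // Hypothetical property name, used here only for illustration.
  private static final String PROVIDER_CLASS_KEY = "hive.llap.split.location.provider.class";

  public static SplitLocationProvider create(Configuration conf) throws ClassNotFoundException {
    String className = conf.get(PROVIDER_CLASS_KEY);
    Class<? extends SplitLocationProvider> clazz =
        conf.getClassByName(className).asSubclass(SplitLocationProvider.class);
    // ReflectionUtils handles no-arg construction and Configurable injection.
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}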





[jira] [Created] (HIVE-23477) [LLAP] mmap allocation interruptions fails to notify other threads

2020-05-15 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23477:


 Summary: [LLAP] mmap allocation interruptions fails to notify 
other threads
 Key: HIVE-23477
 URL: https://issues.apache.org/jira/browse/HIVE-23477
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


BuddyAllocator always uses lazy allocation if mmap is enabled. If a query 
fragment is interrupted at the time of arena allocation, a 
ClosedByInterruptException is thrown. This exception artificially triggers an 
allocator OutOfMemoryError and fails to notify other threads waiting to 
allocate arenas. 
{code:java}
2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed 
trying to allocate memory mapped arena
java.nio.channels.ClosedByInterruptException
at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
at 
org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
at 

[jira] [Created] (HIVE-23476) [LLAP] Preallocate arenas for mmap case as well

2020-05-15 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23476:


 Summary: [LLAP] Preallocate arenas for mmap case as well
 Key: HIVE-23476
 URL: https://issues.apache.org/jira/browse/HIVE-23476
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


BuddyAllocator pre-allocation of arenas does not happen for the mmap cache case. 
Since we are not filling up the mmap'ed buffers, the upfront allocation in the 
constructor is cheap. This can avoid lock-free allocation of arenas later in 
the code. 





[jira] [Created] (HIVE-23472) LLAP Guaranteed state update should trigger queue re-ordering

2020-05-14 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23472:


 Summary: LLAP Guaranteed state update should trigger queue 
re-ordering
 Key: HIVE-23472
 URL: https://issues.apache.org/jira/browse/HIVE-23472
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran


This is a follow-up to HIVE-23443 to handle the guaranteed state update case.





[jira] [Created] (HIVE-23466) ZK registry base should remove only specific instance instead of host

2020-05-13 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23466:


 Summary: ZK registry base should remove only specific instance 
instead of host
 Key: HIVE-23466
 URL: https://issues.apache.org/jira/browse/HIVE-23466
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When ZKRegistryBase detects new ZK nodes, it maintains a path-based cache and a 
host-based cache. The host-based cache already handles multiple instances running 
on the same host. But even if a single instance is removed, all instances belonging to 
that host are removed. 

Another issue is that, if a single host has multiple instances, it returns a Set 
with no ordering. Ideally, we want the newest instance to be at the top of the set 
(use a TreeSet maybe?). 
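A small sketch of the TreeSet idea, where InstanceInfo is a hypothetical stand-in for the registry's per-instance record (not the actual ZKRegistryBase types):
{code:java}
import java.util.Comparator;
import java.util.TreeSet;

public class NewestFirstInstances {
  static final class InstanceInfo {
    final String workerId;
    final long registrationTimeMs;
    InstanceInfo(String workerId, long registrationTimeMs) {
      this.workerId = workerId;
      this.registrationTimeMs = registrationTimeMs;
    }
  }

  // Per-host set ordered newest-first; the worker id tiebreak keeps distinct
  // instances from collapsing into one entry.
  public static TreeSet<InstanceInfo> newPerHostSet() {
    return new TreeSet<>(
        Comparator.comparingLong((InstanceInfo i) -> i.registrationTimeMs).reversed()
            .thenComparing(i -> i.workerId));
  }
}
{code}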





[jira] [Created] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-11 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23443:


 Summary: LLAP speculative task pre-emption seems to be not working
 Key: HIVE-23443
 URL: https://issues.apache.org/jira/browse/HIVE-23443
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran


I think after HIVE-23210 we are getting a stable sort order in the pre-emption 
queue, and it is causing pre-emption to not work in certain cases.
{code:java}
"attempt_1589167813851__119_01_08_0 
(hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started at 
2020-05-11 05:59:22, in preemption queue, can finish)", 
"attempt_1589167813851_0008_84_01_08_1 
(hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started at 
2020-05-11 06:00:23, in preemption queue, can finish)" {code}
The scheduler only peeks at the pre-emption queue and checks whether the head task 
is non-finishable. 

[https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]

In the above case, all tasks are speculative, but the state change is not triggering 
pre-emption queue re-ordering, so peek() always returns a canFinish task even 
though non-finishable tasks are in the queue. 
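A minimal sketch of the re-ordering fix this suggests, with a generic element type as a stand-in for the scheduler's task wrapper (not the actual TaskExecutorService code): a priority queue does not re-sort an element whose priority changes in place, so the state change has to remove and re-offer the task.
{code:java}
import java.util.concurrent.PriorityBlockingQueue;

public final class QueueReorder {
  // Remove and re-offer so the queue's comparator sees the element's new state.
  public static <T> boolean reinsert(PriorityBlockingQueue<T> queue, T task) {
    return queue.remove(task) && queue.offer(task);
  }
}
{code}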





[jira] [Created] (HIVE-23441) Support foreground option for running llap scripts

2020-05-11 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23441:


 Summary: Support foreground option for running llap scripts
 Key: HIVE-23441
 URL: https://issues.apache.org/jira/browse/HIVE-23441
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


LLAP scripts always run in the background. To make them container friendly, 
support foreground execution of the script as an option.





[jira] [Created] (HIVE-23118) Option for exposing compile time counters as tez counters

2020-03-31 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23118:


 Summary: Option for exposing compile time counters as tez counters
 Key: HIVE-23118
 URL: https://issues.apache.org/jira/browse/HIVE-23118
 Project: Hive
  Issue Type: Improvement
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


TezCounters are currently runtime only. Some compile-time information from the 
optimizer can be exposed as counters, which can then be used by workload 
management to make runtime decisions. 





[jira] [Created] (HIVE-22994) Add total file size to explain

2020-03-06 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22994:


 Summary: Add total file size to explain
 Key: HIVE-22994
 URL: https://issues.apache.org/jira/browse/HIVE-22994
 Project: Hive
  Issue Type: Improvement
Reporter: Prasanth Jayachandran


HIVE-22979 added total file size to the Statistics object for the table scan operator. 
For debugging, it will be very useful to know the actual on-disk file size directly 
from the explain output (instead of having to get describe formatted output). 





[jira] [Created] (HIVE-22988) LLAP: If consistent splits is disabled ordering instances is not required

2020-03-06 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22988:


 Summary: LLAP: If consistent splits is disabled ordering instances 
is not required
 Key: HIVE-22988
 URL: https://issues.apache.org/jira/browse/HIVE-22988
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


LlapTaskSchedulerService always gets a consistently ordered list of all LLAP 
instances even if consistent splits are disabled. When consistent splits are 
disabled, ordering isn't really useful, as there is no cache locality. 





[jira] [Created] (HIVE-22979) Support total file size in statistics annotation

2020-03-05 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22979:


 Summary: Support total file size in statistics annotation
 Key: HIVE-22979
 URL: https://issues.apache.org/jira/browse/HIVE-22979
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


Hive statistics annotation provides estimated Statistics for each operator. The 
data size provided in TableScanOperator is the raw data size (after decompression 
and decoding), but there are some optimizations that can be performed based on the 
total file size on disk (scan cost estimation).





[jira] [Created] (HIVE-22922) LLAP: ShuffleHandler may not find shuffle data if pod restarts in k8s

2020-02-21 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22922:


 Summary: LLAP: ShuffleHandler may not find shuffle data if pod 
restarts in k8s
 Key: HIVE-22922
 URL: https://issues.apache.org/jira/browse/HIVE-22922
 Project: Hive
  Issue Type: Bug
Reporter: Nita Dembla
Assignee: Prasanth Jayachandran


Executor logs show "Invalid map id: TTP/1.1 500 Internal Server Error". This 
happens when an executor pod restarts with the same hostname and port but is missing 
the shuffle data.





[jira] [Created] (HIVE-22908) AM caching connections to LLAP based on hostname and port does not work in kubernetes

2020-02-18 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22908:


 Summary: AM caching connections to LLAP based on hostname and port 
does not work in kubernetes
 Key: HIVE-22908
 URL: https://issues.apache.org/jira/browse/HIVE-22908
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The AM caches all connections to LLAP services using a combination of hostname 
and port, which does not work in a Kubernetes environment where the hostname and 
port of a pod can stay the same with a StatefulSet. This causes the AM to talk to 
an old LLAP instance that could have died (OOM, pod kill, etc.). 





[jira] [Created] (HIVE-22859) Tez external sessions are leaking

2020-02-09 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-22859:


 Summary: Tez external sessions are leaking
 Key: HIVE-22859
 URL: https://issues.apache.org/jira/browse/HIVE-22859
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When Tez external/unmanaged sessions are used, TezSessionPoolManager "opens" 
the session using the ApplicationId, which essentially connects to the existing 
external/unmanaged session. But when the session is returned, it is not closed 
(closing an external session essentially releases it from the sessions list and 
does not really kill the session). If a session is not closed, it is never 
removed from the openSessions linked list that HS2 maintains, hence leaking the 
session. 





[jira] [Created] (HIVE-21970) Avoid using RegistryUtils.currentUser()

2019-07-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21970:


 Summary: Avoid using RegistryUtils.currentUser()
 Key: HIVE-21970
 URL: https://issues.apache.org/jira/browse/HIVE-21970
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


RegistryUtils.currentUser() replaces '_' with '-' for DNS reasons. 
This is used inconsistently in some places, causing issues wrt ZK (delegation 
token secret manager, LLAP cluster membership for external clients).





[jira] [Created] (HIVE-21925) HiveConnection retries should support backoff

2019-06-26 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21925:


 Summary: HiveConnection retries should support backoff
 Key: HIVE-21925
 URL: https://issues.apache.org/jira/browse/HIVE-21925
 Project: Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


The Hive JDBC connection supports retries. In HTTP mode, retries always seem to 
happen immediately without any backoff.
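A small sketch of the kind of backoff the retries could use; the helper and its parameters are illustrative only, not existing HiveConnection settings:
{code:java}
import java.util.concurrent.Callable;

public final class RetryWithBackoff {
  // Exponential backoff between attempts: base, 2x, 4x ... capped at maxDelayMs.
  public static <T> T call(Callable<T> attempt, int maxRetries,
                           long baseDelayMs, long maxDelayMs) throws Exception {
    Exception last = null;
    for (int i = 0; i <= maxRetries; i++) {
      try {
        return attempt.call();
      } catch (Exception e) {
        last = e;
        if (i < maxRetries) {
          Thread.sleep(Math.min(maxDelayMs, baseDelayMs << i));
        }
      }
    }
    throw last;
  }
}
{code}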

 





[jira] [Created] (HIVE-21924) Split text files if only header/footer is present

2019-06-25 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21924:


 Summary: Split text files if only header/footer is present
 Key: HIVE-21924
 URL: https://issues.apache.org/jira/browse/HIVE-21924
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 2.4.0, 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
 this piece of code makes CSV files (or any text files with a header/footer) 
not splittable if a header or footer is present. 
If only a header is present, we can find the offset after the first line break and 
use that to split. Similarly for the footer, maybe read a few KBs of data at the 
end and find the last line break offset. Use that to determine the data range 
which can be used for splitting. A few reads during split generation are cheaper 
than not splitting the file at all.  
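A rough sketch of the header-offset part of this idea, using the plain Hadoop FileSystem API (the helper is hypothetical, not the proposed patch):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class HeaderOffset {
  // Returns the byte offset just past the first line break, i.e. where splittable
  // data starts when a single header line has to be skipped.
  public static long firstDataOffset(FileSystem fs, Path file) throws IOException {
    try (FSDataInputStream in = fs.open(file)) {
      long offset = 0;
      int b;
      while ((b = in.read()) != -1) {
        offset++;
        if (b == '\n') {
          return offset;
        }
      }
      return offset; // no line break found; the file is header-only
    }
  }
}
{code}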





[jira] [Created] (HIVE-21913) GenericUDTFGetSplits should handle usernames in the same way as LLAP

2019-06-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21913:


 Summary: GenericUDTFGetSplits should handle usernames in the same 
way as LLAP
 Key: HIVE-21913
 URL: https://issues.apache.org/jira/browse/HIVE-21913
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


LLAP ZK registry namespacing includes the current user name, which is typically 
hive. But in some deployments, usernames are created with '_', like hive_dev. 
RegistryUtils.currentUser() (which LLAP uses) replaces '_' with '-' for DNS 
reasons. But GenericUDTFGetSplits uses the UGI login user, which does not do the 
underscore replacement. As a result, LlapBaseInputFormat is not finding any LLAP 
daemons even though they are running. 





[jira] [Created] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21892:


 Summary: Trusted domain authentication should look at 
X-Forwarded-For header as well
 Key: HIVE-21892
 URL: https://issues.apache.org/jira/browse/HIVE-21892
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-21783 added trusted domain authentication. However, it looks only at 
request.getRemoteAddr(), which works in most cases where there are no 
intermediate forward/reverse proxies. In trusted domain scenarios, if there are 
intermediate proxies, each proxy typically appends its own IP address to the 
"X-Forwarded-For" header. The X-Forwarded-For value will look like clientIp -> 
proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
represents the real client IP address. For such scenarios, add a config to 
optionally look at the X-Forwarded-For header, when available, to determine the real 
client IP. 
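A minimal sketch of the left-most-address rule described above, using the servlet API (an illustration, not the committed HIVE-21892 change):
{code:java}
import javax.servlet.http.HttpServletRequest;

public final class ClientIpResolver {
  // Prefer the first (left-most) X-Forwarded-For entry; fall back to the socket peer.
  public static String resolveClientIp(HttpServletRequest request) {
    String forwardedFor = request.getHeader("X-Forwarded-For");
    if (forwardedFor != null && !forwardedFor.isEmpty()) {
      return forwardedFor.split(",")[0].trim();
    }
    return request.getRemoteAddr();
  }
}
{code}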





[jira] [Created] (HIVE-21825) Improve client error msg when Active/Passive HA is enabled

2019-06-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21825:


 Summary: Improve client error msg when Active/Passive HA is enabled
 Key: HIVE-21825
 URL: https://issues.apache.org/jira/browse/HIVE-21825
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


When Active/Passive HA is enabled and a client tries to connect to the passive 
HS2 instance, or when HS2 is still starting up, clients will receive the following 
error msg
{code:java}
'Cannot open sessions on an inactive HS2 instance; use service discovery to 
connect'{code}
This error msg can be improved to say that HS2 is still starting up (or a more 
user-friendly error msg). 





[jira] [Created] (HIVE-21624) LLAP: Cpu metrics and gc time metrics at thread level is broken

2019-04-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21624:


 Summary: LLAP: Cpu metrics and gc time metrics at thread level is 
broken
 Key: HIVE-21624
 URL: https://issues.apache.org/jira/browse/HIVE-21624
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0, 3.2.0
Reporter: Nita Dembla
Assignee: Prasanth Jayachandran


ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU 
metrics when available. At some point, the thread name that the metrics 
publisher looks for changed, causing no metrics to be published for these 
counters.  

The above counters look for threads whose name starts with "ContainerExecutor", 
but the LLAP task executor thread name got changed to "Task-Executor".
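For context, a small illustration of why these counters depend on the thread name prefix; the prefix is just a parameter here and the class is not the actual metrics publisher:
{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public final class ExecutorCpuTime {
  // Sum per-thread CPU time (nanoseconds) for threads whose name starts with the
  // expected executor prefix; a renamed executor thread silently drops out of the sum.
  public static long totalCpuNanosForPrefix(String namePrefix) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long total = 0;
    for (long id : mx.getAllThreadIds()) {
      ThreadInfo info = mx.getThreadInfo(id);
      if (info != null && info.getThreadName().startsWith(namePrefix)) {
        long cpu = mx.getThreadCpuTime(id); // -1 if CPU time measurement is disabled
        if (cpu > 0) {
          total += cpu;
        }
      }
    }
    return total;
  }
}
{code}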





[jira] [Created] (HIVE-21597) WM trigger validation should happen at the time of create or alter

2019-04-10 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21597:


 Summary: WM trigger validation should happen at the time of create 
or alter
 Key: HIVE-21597
 URL: https://issues.apache.org/jira/browse/HIVE-21597
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When a query guardrail trigger is created, the trigger expression is not 
validated immediately upon creating or altering the trigger. Instead, it gets 
validated at HS2 startup, which could prevent resource plans from being 
applied correctly. The trigger expression validation should happen in DDLTask. 





[jira] [Created] (HIVE-21591) Using triggers in non-LLAP mode should not require wm queue

2019-04-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21591:


 Summary: Using triggers in non-LLAP mode should not require wm 
queue
 Key: HIVE-21591
 URL: https://issues.apache.org/jira/browse/HIVE-21591
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Resource plan triggers are supported in non-LLAP (Tez container) mode. But 
fetching of the resource plan happens only when hive.server2.tez.interactive.queue 
is set. For Tez container mode, only triggers are applicable, so this queue 
dependency can be removed. 





[jira] [Created] (HIVE-21582) Prefix msck configs with metastore

2019-04-04 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21582:


 Summary: Prefix msck configs with metastore
 Key: HIVE-21582
 URL: https://issues.apache.org/jira/browse/HIVE-21582
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-20707 moved the msck configs to the metastore, but the configs are not prefixed 
with "metastore". It will be good to prefix them with "metastore" for consistency 
with the other configs. 





[jira] [Created] (HIVE-21527) LLAP: Table property to skip cache

2019-03-27 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21527:


 Summary: LLAP: Table property to skip cache
 Key: HIVE-21527
 URL: https://issues.apache.org/jira/browse/HIVE-21527
 Project: Hive
  Issue Type: Improvement
  Components: llap
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Similar to HIVE-21305, there can be text tables with big string columns that 
are not cache friendly (they often pollute the cache).





[jira] [Created] (HIVE-21497) Direct SQL exception thrown by PartitionManagementTask

2019-03-23 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21497:


 Summary: Direct SQL exception thrown by PartitionManagementTask
 Key: HIVE-21497
 URL: https://issues.apache.org/jira/browse/HIVE-21497
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


The metastore runs background threads, one of which is partition discovery. While 
removing expired partitions, the following exception is thrown
{code:java}
2019-03-24 04:24:59.583 WARN [PartitionDiscoveryTask-0] 
metastore.MetaStoreDirectSql: Failed to execute [select "PARTITIONS"."PART_ID" 
from "PARTITIONS" inner join "TBLS" on "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" 
and "TBLS"."TBL_NAME" = ? inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" 
and "DBS"."NAME" = ? inner join "PARTITION_KEY_VALS" "FILTER0" on 
"FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 
inner join "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 inner join 
"PARTITION_KEY_VALS" "FILTER2" on "FILTER2"."PART_ID" = "PARTITIONS"."PART_ID" 
and "FILTER2"."INTEGER_IDX" = 2 where "DBS"."CTLG_NAME" = ? and ( ( (((case 
when "FILTER0"."PART_KEY_VAL" <> ? and "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = 
? and "DBS"."CTLG_NAME" = ? and "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" 
and "FILTER0"."INTEGER_IDX" = 0 then cast("FILTER0"."PART_KEY_VAL" as date) 
else null end) = ?) and ("FILTER1"."PART_KEY_VAL" = ?)) and 
("FILTER2"."PART_KEY_VAL" = ?)) )] with parameters [logs, sys, hive, 
__HIVE_DEFAULT_PARTITION__, logs, sys, hive, 2019-03-23, 
warehouse-1553300821-692w, metastore-db-create-job]
javax.jdo.JDODataStoreException: Error executing SQL query "select 
"PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join 
"DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join 
"PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" 
and "FILTER0"."INTEGER_IDX" = 0 inner join "PARTITION_KEY_VALS" "FILTER1" on 
"FILTER1"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 
inner join "PARTITION_KEY_VALS" "FILTER2" on "FILTER2"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER2"."INTEGER_IDX" = 2 where "DBS"."CTLG_NAME" 
= ? and ( ( (((case when "FILTER0"."PART_KEY_VAL" <> ? and "TBLS"."TBL_NAME" = 
? and "DBS"."NAME" = ? and "DBS"."CTLG_NAME" = ? and "FILTER0"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 then 
cast("FILTER0"."PART_KEY_VAL" as date) else null end) = ?) and 
("FILTER1"."PART_KEY_VAL" = ?)) and ("FILTER2"."PART_KEY_VAL" = ?)) )".
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2042)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionIdsViaSqlFilter(MetaStoreDirectSql.java:621)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:487)
at 
org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:3426)
at 
org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:3418)
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3702)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3453)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3406)
at sun.reflect.GeneratedMethodAccessor82.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
at com.sun.proxy.$Proxy33.getPartitionsByExpr(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_partitions_req(HiveMetaStore.java:4521)
at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy34.drop_partitions_req(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropPartitions(HiveMetaStoreClient.java:1288)
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:474)
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:435)

[jira] [Created] (HIVE-21496) Automatic sizing of unordered buffer can overflow

2019-03-23 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21496:


 Summary: Automatic sizing of unordered buffer can overflow
 Key: HIVE-21496
 URL: https://issues.apache.org/jira/browse/HIVE-21496
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
 Attachments: hive.log

HIVE-21329 added automatic sizing of the Tez unordered partitioned KV buffer based 
on group by statistics. However, in some corner cases the group by statistics set 
Long.MAX_VALUE for the data size, which ends up setting Integer.MAX_VALUE for the 
unordered KV buffer size. This buffer size is expected to be in MB; converting the 
Integer.MAX_VALUE MB value to bytes overflows, and the following exception is thrown.
{code:java}
2019-03-23T01:35:17,760 INFO [Dispatcher thread {Central}] 
HistoryEventHandler.criticalEvents: 
[HISTORY][DAG:dag_1553330105749_0001_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 1, taskAttemptId=attempt_1553330105749_0001_1_00_00_0, 
creationTime=1553330117468, allocationTime=1553330117524, 
startTime=1553330117562, finishTime=1553330117755, timeTaken=193, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1553330105749_0001_1_00_00_0:java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108)
at 
org.apache.tez.runtime.common.resources.MemoryDistributor.registerRequest(MemoryDistributor.java:177)
at 
org.apache.tez.runtime.common.resources.MemoryDistributor.requestMemory(MemoryDistributor.java:110)
at 
org.apache.tez.runtime.api.impl.TezTaskContextImpl.requestInitialMemory(TezTaskContextImpl.java:214)
at 
org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput.initialize(UnorderedPartitionedKVOutput.java:76)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:537)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:520)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:505)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745){code}
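As a small illustration of the overflow described above (this is not the Tez/Hive code, just the int arithmetic):
{code:java}
public class BufferSizeOverflow {
  public static void main(String[] args) {
    int bufferSizeMb = Integer.MAX_VALUE;
    // Converting MB to bytes in int arithmetic wraps around to a negative value,
    // which then trips argument checks like the Preconditions.checkArgument above.
    int bufferSizeBytes = bufferSizeMb * 1024 * 1024;
    System.out.println(bufferSizeBytes); // prints -1048576
  }
}
{code}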
 

Stats for the GBY operator are getting Long.MAX_VALUE, as seen below
{code:java}
2019-03-23T01:35:16,466 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: [0] STATS-TS[0] (logs): numRows: 1795 
dataSize: 4443078 basicStatsState: PARTIAL colStatsState: NONE colStats: 
{severity= colName: severity colType: string countDistincts: 359 numNulls: 89 
avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,466 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: Estimating row count for 
GenericUDFOPEqual(Column[severity], Const string ERROR) Original num rows: 1795 
New num rows: 5
2019-03-23T01:35:16,467 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: [1] STATS-FIL[8]: numRows: 5 dataSize: 12376 
basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: 
severity colType: string countDistincts: 359 numNulls: 89 avgColLen: 100.0 
numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,467 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
exec.FilterOperator: Setting stats (Num rows: 5 Data size: 12376 Basic stats: 
PARTIAL Column stats: NONE) on: FIL[8]
2019-03-23T01:35:16,468 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
exec.SelectOperator: Setting stats (Num rows: 5 Data size: 12376 Basic stats: 
PARTIAL Column stats: NONE) on: SEL[2]
2019-03-23T01:35:16,468 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: [1] STATS-SEL[2]: numRows: 5 dataSize: 12376 
basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: 
severity colType: string countDistincts: 359 numNulls: 89 avgColLen: 100.0 
numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,471 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: STATS-GBY[3]: inputSize: 4443078 
maxSplitSize: 25600 parallelism: 1 containsGroupingSet: false 
sizeOfGroupingSet: 1
2019-03-23T01:35:16,471 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] 
annotation.StatsRulesProcFactory: 

[jira] [Created] (HIVE-21495) Calcite assertion error when UDF returns null

2019-03-22 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21495:


 Summary: Calcite assertion error when UDF returns null
 Key: HIVE-21495
 URL: https://issues.apache.org/jira/browse/HIVE-21495
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Jesus Camacho Rodriguez


Calcite throws the following error when a UDF returns null. 
"current_authorizer()", for example, can return null if the authorizer is disabled.
{code:java}
org.apache.hive.service.cli.HiveSQLException: Error running query: 
java.lang.AssertionError: Cannot add expression of different type to set:
set type is RecordType(CHAR(7) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" NOT NULL $f0, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f1, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f3, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f4, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f5, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f6, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f7, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f8, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f9, CHAR(2) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" NOT NULL $f10, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f11) NOT NULL
expression type is RecordType(CHAR(7) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" NOT NULL $f0, CHAR(7) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" NOT NULL $f1, VARCHAR(2147483647) CHARACTER 
SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f3, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f4, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f5, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f6, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f7, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f8, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f9, CHAR(2) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" NOT 
NULL $f10, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f11) NOT NULL
set is rel#784:HiveAggregate.HIVE.[](input=HepRelVertex#783,group={0, 1, 2, 3, 
4, 5, 6, 7, 8, 9, 10, 11})
expression is HiveProject#829
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:210)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AssertionError: Cannot add expression of different type to 
set:
set type is RecordType(CHAR(7) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" NOT NULL $f0, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f1, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f3, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f4, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f5, VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f6, VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f7, 
VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" 
$f8, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f9, CHAR(2) CHARACTER SET "UTF-16LE" 

[jira] [Created] (HIVE-21482) Partition discovery table property is added to non-partitioned external tables

2019-03-19 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21482:


 Summary: Partition discovery table property is added to 
non-partitioned external tables
 Key: HIVE-21482
 URL: https://issues.apache.org/jira/browse/HIVE-21482
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Automatic partition discovery is added to external tables by default. But it 
doesn't check if the external table is partitioned or not.





[jira] [Created] (HIVE-21457) Perf optimizations in split-generation

2019-03-15 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21457:


 Summary: Perf optimizations in split-generation
 Key: HIVE-21457
 URL: https://issues.apache.org/jira/browse/HIVE-21457
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Minor split generation optimizations
 * Reuse vectorization checks
 * Reuse isAcid checks
 * Reuse filesystem objects
 * Improved logging (log at top-level instead of inside the thread pool)





[jira] [Created] (HIVE-21415) Parallel build is failing, trying to download incorrect hadoop-hdfs-client version

2019-03-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21415:


 Summary: Parallel build is failing, trying to download incorrect 
hadoop-hdfs-client version
 Key: HIVE-21415
 URL: https://issues.apache.org/jira/browse/HIVE-21415
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Running the following build command
{code:java}
mvn clean install -Pdist -DskipTests -Dpackaging.minimizeJar=false -T 1C 
-DskipShade -Dremoteresources.skip=true -Dmaven.javadoc.skip=true{code}
fails with the following exception for 3 modules (hplql, kryo-registrator, 
packaging)
{code:java}
[ERROR] Failed to execute goal on project hive-packaging: Could not resolve 
dependencies for project org.apache.hive:hive-packaging:pom:4.0.0-SNAPSHOT: 
Failure to find org.apache.hadoop:hadoop-hdfs-client:jar:2.7.3 in 
http://www.datanucleus.org/downloads/maven2 was cached in the local repository, 
resolution will not be reattempted until the update interval of datanucleus has 
elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn  -rf :hive-packaging{code}
 

It is trying to download version 2.7.3, but hadoop.version refers to 3.1.0.





[jira] [Created] (HIVE-21391) LLAP: Pool of column vector buffers can cause memory pressure

2019-03-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21391:


 Summary: LLAP: Pool of column vector buffers can cause memory 
pressure
 Key: HIVE-21391
 URL: https://issues.apache.org/jira/browse/HIVE-21391
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


When there are too many columns (on the order of 100s) with decimal or string 
types, the column vector pool of buffers created here 
[https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/EncodedDataConsumer.java#L59]
 can cause memory pressure. 

Example:

128 (poolSize) * 300 (numCols) * 1024 (batchSize) * 80 (decimalSize) ~= 3GB

The pool size keeps increasing when there is a slow consumer but fast LLAP IO 
(SSDs), leading to GC pressure when all LLAP IO threads read splits from the same 
table. 





[jira] [Created] (HIVE-21390) BI split strategy does not work for blob stores

2019-03-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21390:


 Summary: BI split strategy does not work for blob stores
 Key: HIVE-21390
 URL: https://issues.apache.org/jira/browse/HIVE-21390
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The BI split strategy cuts splits at block boundaries; however, there are no block 
boundaries in blob storage, so we end up with one split when using the BI split strategy. 
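A tiny illustration of the effect (the file and split sizes are made up; this is not the ORC split-strategy code): when the "no block boundaries" case is modeled as one block spanning the whole object, block-boundary cutting yields a single split, while cutting at a target split size would not.
{code:java}
public class BlobStoreSplits {
  public static void main(String[] args) {
    long fileLen = 10L * 1024 * 1024 * 1024;      // 10 GB object
    long blockSize = fileLen;                     // one "block" spanning the object
    long targetSplitSize = 256L * 1024 * 1024;    // 256 MB

    long splitsByBlockBoundary = (fileLen + blockSize - 1) / blockSize;          // 1
    long splitsByTargetSize = (fileLen + targetSplitSize - 1) / targetSplitSize; // 40
    System.out.println(splitsByBlockBoundary + " vs " + splitsByTargetSize);
  }
}
{code}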





[jira] [Created] (HIVE-21373) Expose query result cache info as a table

2019-03-01 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21373:


 Summary: Expose query result cache info as a table
 Key: HIVE-21373
 URL: https://issues.apache.org/jira/browse/HIVE-21373
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


To be able to look into the metadata of the query result cache (size, cache 
hit/miss, location, etc.) using a query like
{code:java}
select * from query_cache();{code}
it will be good to expose these as a queryable table. 





[jira] [Created] (HIVE-21369) LLAP: Logging is expensive in encoded reader path

2019-03-01 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21369:


 Summary: LLAP: Logging is expensive in encoded reader path
 Key: HIVE-21369
 URL: https://issues.apache.org/jira/browse/HIVE-21369
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Nita Dembla


There should be no INFO logging in EncodedReaderImpl. Stringifying the disk 
ranges is expensive in the core read path.
{code:java}
2019-03-01T17:55:56.322852142Z 2019-03-01T17:55:56,306 INFO  
[IO-Elevator-Thread-3 
(hive_20190301175546_a279f33c-4f2b-4cd5-8695-57bc8b042a61)] 
encoded.EncodedReaderImpl: Disk ranges after cache (found everything true; file 
[-3693547618692831801, 1551190876000, 1047660824], base offset 792920167): 
[{start: 887940 end: 1003508 cache buffer: 0x5165f83d(1)}, {start: 1003508 end: 
1119078 cache buffer: 0xb63cac3(1)}, {start: 1119078 end: 1234745 cache buffer: 
0x41a724fa(1)}, {start: 1234745 end: 1350261 cache buffer: 0x2f71bc38(1)}, 
{start: 1350261 end: 1465752 cache buffer: 0x2c38e1bb(1)}, {start: 1465752 end: 
1581231 cache buffer: 0x5827982(1)}, {start: 1581231 end: 1696885 cache buffer: 
0x75a6773c(1)}, {start: 1696885 end: 1812492 cache buffer: 
0x2ed060f9(1)},{start: 1812492 end: 1928086 cache buffer: 0x20b2c8aa(1)}, 
{start: 1928086 end: 2043588 cache buffer: 0x6559aacb(1)}, {start: 2043588 end: 
2159089 cache buffer: 0x569c85e1(1)}, {start: 2159089 end: 2274725 cache 
buffer: 0x25a88dd0(1)}, {start: 2274725 end: 2390228 cache buffer: 
0x738b7e87(1)}, {start: 2390228 end: 2505715 cache buffer: 0x26edafa0(1)}, 
{start: 2505715 end: 2621322 cache buffer: 0x69db7752(1)}, {start: 2621322 end: 
2736844 cache b{code}





[jira] [Created] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-02-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21305:


 Summary: LLAP: Option to skip cache for ETL queries
 Key: HIVE-21305
 URL: https://issues.apache.org/jira/browse/HIVE-21305
 Project: Hive
  Issue Type: Improvement
  Components: llap
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


To keep ETL queries from polluting the cache, it would be good to detect such 
queries at compile time and optionally skip LLAP IO for them. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21254) Pre-upgrade tool should handle exceptions and skip db/tables

2019-02-12 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21254:


 Summary: Pre-upgrade tool should handle exceptions and skip 
db/tables
 Key: HIVE-21254
 URL: https://issues.apache.org/jira/browse/HIVE-21254
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When exceptions like AccessControlException are thrown, the pre-upgrade tool fails. 
If the hive user does not have read access to a database or table (some external 
tables deny read access to hive), the pre-upgrade tool should just assume they 
are external tables and move on without failing the pre-upgrade process. 
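
A minimal sketch of the intended skip-and-continue behavior (processTable is a hypothetical helper standing in for the tool's per-table work):
{code:java}
import java.util.List;
import org.apache.hadoop.security.AccessControlException;

public class PreUpgradeSkipSketch {
  interface TableProcessor {
    void processTable(String db, String table) throws Exception;
  }

  // Iterate over tables; treat access-denied tables as external and keep going.
  static void processAll(String db, List<String> tables, TableProcessor processor) {
    for (String table : tables) {
      try {
        processor.processTable(db, table);
      } catch (AccessControlException e) {
        // Likely an external table hive cannot read; skip instead of failing the whole run.
        System.err.println("Skipping " + db + "." + table + ": " + e.getMessage());
      } catch (Exception e) {
        System.err.println("Skipping " + db + "." + table + " due to: " + e);
      }
    }
  }
}
{code}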



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21244) NPE in Hive Proto Logger

2019-02-11 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21244:


 Summary: NPE in Hive Proto Logger
 Key: HIVE-21244
 URL: https://issues.apache.org/jira/browse/HIVE-21244
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


[https://github.com/apache/hive/blob/4ddc9de90b6de032d77709c9631ab787cef225d5/ql/src/java/org/apache/hadoop/hive/ql/hooks/HiveProtoLoggingHook.java#L308]
 can cause an NPE. There is no uncaught exception handler for this thread, so the 
NPE can fail silently and drop the event.
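
Illustrative sketch only (thread name and handler body are assumptions): installing an uncaught-exception handler on the logger's worker thread so failures are at least reported rather than silently dropped:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LoggerThreadSketch {
  public static void main(String[] args) {
    ExecutorService logWriter = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "proto-logger");
      t.setDaemon(true);
      // Report anything that escapes the task instead of dying silently.
      t.setUncaughtExceptionHandler((thread, e) ->
          System.err.println("Proto logger thread failed: " + e));
      return t;
    });
    // Using execute() (not submit()) so a thrown NPE reaches the handler above.
    logWriter.execute(() -> { /* write the event here */ });
    logWriter.shutdown();
  }
}
{code}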



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21235) LLAP: make the name of log4j2 properties file configurable

2019-02-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21235:


 Summary: LLAP: make the name of log4j2 properties file configurable
 Key: HIVE-21235
 URL: https://issues.apache.org/jira/browse/HIVE-21235
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


For the LLAP daemon, the log4j2 properties file name (llap-daemon-log4j2.properties) 
is fixed. If a conf dir and a jar both contain a file with that name, log4j2 
initialization can pick up the wrong one. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21223) CachedStore returns null partition when partition does not exist

2019-02-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21223:


 Summary: CachedStore returns null partition when partition does 
not exist
 Key: HIVE-21223
 URL: https://issues.apache.org/jira/browse/HIVE-21223
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


CachedStore can return a null partition from getPartitionWithAuth() when the partition 
does not exist. Serializing a null value in thrift breaks the connection. 
Instead, if the partition does not exist, it should throw NoSuchObjectException.

Clients will see this exception
{code:java}
org.apache.thrift.TApplicationException: get_partition_with_auth failed: 
unknown result
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition_with_auth(ThriftHiveMetastore.java:3017)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition_with_auth(ThriftHiveMetastore.java:2990)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:1679)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:1671)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_181]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_181]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at com.sun.proxy.$Proxy36.getPartitionWithAuthInfo(Unknown Source) ~[?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_181]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_181]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2976)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at com.sun.proxy.$Proxy36.getPartitionWithAuthInfo(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.SynchronizedMetaStoreClient.getPartitionWithAuthInfo(SynchronizedMetaStoreClient.java:101)
 ~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:2870) 
~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:2835) 
~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1950) 
~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at org.apache.hadoop.hive.ql.metadata.Hive$4.call(Hive.java:2490) 
~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at org.apache.hadoop.hive.ql.metadata.Hive$4.call(Hive.java:2481) 
~[hive-exec-3.1.0.3.0.100.0-266.jar:3.1.0.3.0.100.0-266]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]{code}
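
A minimal sketch of the intended contract, throwing on a miss instead of returning null; the cache map and the locally defined exception are stand-ins for CachedStore internals and the metastore's NoSuchObjectException:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedStoreSketch {
  // Stand-in for org.apache.hadoop.hive.metastore.api.NoSuchObjectException.
  static class NoSuchObjectException extends Exception {
    NoSuchObjectException(String msg) { super(msg); }
  }

  private final Map<String, Object> partitionCache = new ConcurrentHashMap<>();

  // Never hand a null back to the thrift layer; signal the miss explicitly.
  Object getPartitionWithAuth(String key) throws NoSuchObjectException {
    Object partition = partitionCache.get(key);
    if (partition == null) {
      throw new NoSuchObjectException("Partition not found: " + key);
    }
    return partition;
  }
}
{code}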
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21222:


 Summary: ACID: When there are no delete deltas skip finding min 
max keys
 Key: HIVE-21222
 URL: https://issues.apache.org/jira/browse/HIVE-21222
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys 
(which reads the 16K footer) even for cases where delete deltas do not exist.
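
An illustrative early-return guard (class and method names are hypothetical) showing the shape of the fix: don't open a reader at all when there is nothing to filter against:
{code:java}
import java.util.Collections;
import java.util.List;

public class MinMaxKeySketch {
  static final class KeyInterval {
    static final KeyInterval ANY = new KeyInterval();   // "no bounds" marker
  }

  // Hypothetical shape of findMinMaxKeys: skip the footer read entirely
  // when the split has no delete deltas to apply.
  static KeyInterval findMinMaxKeys(List<String> deleteDeltaDirs) {
    if (deleteDeltaDirs == null || deleteDeltaDirs.isEmpty()) {
      // Nothing to anti-join against, so min/max keys are never consulted.
      return KeyInterval.ANY;
    }
    // ... otherwise open the ORC reader and compute the real interval ...
    return computeFromFooter(deleteDeltaDirs);
  }

  private static KeyInterval computeFromFooter(List<String> dirs) {
    return KeyInterval.ANY; // placeholder for the real footer-based computation
  }

  public static void main(String[] args) {
    System.out.println(findMinMaxKeys(Collections.emptyList()) == KeyInterval.ANY);
  }
}
{code}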



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21212:


 Summary: LLAP: shuffle port config uses internal configuration
 Key: HIVE-21212
 URL: https://issues.apache.org/jira/browse/HIVE-21212
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


LlapDaemon main() reads the daemon configuration, but for the shuffle port it reads 
an internal config instead of hive.llap.daemon.yarn.shuffle.port:

[https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-21103:


 Summary: PartitionManagementTask should not modify DN configs to 
avoid closing persistence manager
 Key: HIVE-21103
 URL: https://issues.apache.org/jira/browse/HIVE-21103
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-20707 added automatic partition management, which uses thread pools to run 
msck repair in parallel. It also modifies the datanucleus connection pool size to 
avoid an explosion of connections to the backend database. But ObjectStore closes the 
persistence manager when it detects a change in datanucleus or jdo configs. So 
while PartitionManagementTask is running, when HS2 tries to connect to the 
metastore it will get a persistence-manager-closed exception. 
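
One way to avoid this (a sketch, not the actual patch) is to tune the connection pool on a private copy of the Configuration so the shared metastore conf never changes; the property name is shown for illustration:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class PartitionManagementConfSketch {
  public static void main(String[] args) {
    Configuration sharedConf = new Configuration();

    // Work on a copy; mutating sharedConf would make ObjectStore think the
    // datanucleus/jdo settings changed and close its persistence manager.
    Configuration taskConf = new Configuration(sharedConf);
    taskConf.setInt("datanucleus.connectionPool.maxPoolSize", 2);

    System.out.println("shared: " + sharedConf.get("datanucleus.connectionPool.maxPoolSize"));
    System.out.println("task:   " + taskConf.get("datanucleus.connectionPool.maxPoolSize"));
  }
}
{code}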



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20876) Use tez provided AM registry client for external sessions

2018-11-06 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20876:


 Summary: Use tez provided AM registry client for external sessions
 Key: HIVE-20876
 URL: https://issues.apache.org/jira/browse/HIVE-20876
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Continuation to HIVE-20547, replace hive side AM external sessions registry 
with the one provided by tez. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20841) LLAP: Make dynamic ports configurable

2018-10-30 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20841:


 Summary: LLAP: Make dynamic ports configurable
 Key: HIVE-20841
 URL: https://issues.apache.org/jira/browse/HIVE-20841
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Fix For: 4.0.0, 3.2.0


Some ports in the llap -> tez interaction code are dynamic; provide an 
option to make them configurable to facilitate adding them to iptables rules in 
some environments. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20713) Use percentage for join conversion size thresholds

2018-10-08 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20713:


 Summary: Use percentage for join conversion size thresholds
 Key: HIVE-20713
 URL: https://issues.apache.org/jira/browse/HIVE-20713
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


There are many places in join conversion that rely on absolute byte sizes 
(mapjoin, dynamic hashjoin, etc.). When container sizes change, 
these join conversion thresholds have to be retuned for the 
new container size. Instead, make the join conversion byte-size thresholds a 
percentage/fraction of the container size. 
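
A sketch of the proposed arithmetic (the fraction setting is hypothetical): the absolute threshold is derived from the container size instead of being configured directly, so it tracks container resizes:
{code:java}
public class JoinThresholdSketch {
  // Derive the mapjoin conversion threshold from the container size so it
  // scales automatically when containers are resized.
  static long mapJoinThresholdBytes(long containerSizeBytes, double fraction) {
    return (long) (containerSizeBytes * fraction);
  }

  public static void main(String[] args) {
    long fourGb = 4L * 1024 * 1024 * 1024;
    long eightGb = 8L * 1024 * 1024 * 1024;
    // Same fraction, different containers -> thresholds track the container size.
    System.out.println(mapJoinThresholdBytes(fourGb, 0.25));   // 1 GB
    System.out.println(mapJoinThresholdBytes(eightGb, 0.25));  // 2 GB
  }
}
{code}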



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20707) Automatic MSCK REPAIR for external tables

2018-10-07 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20707:


 Summary: Automatic MSCK REPAIR for external tables
 Key: HIVE-20707
 URL: https://issues.apache.org/jira/browse/HIVE-20707
 Project: Hive
  Issue Type: New Feature
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


In current scenario, to add partitions for external tables to metastore, MSCK 
REPAIR command has to be executed manually. To avoid this manual step, external 
tables can be specified a table property based on which a background metastore 
thread can add/drop/sync partitions periodically. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20656) Map aggregation memory configs are too aggressive

2018-09-28 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20656:


 Summary: Map aggregation memory configs are too aggressive
 Key: HIVE-20656
 URL: https://issues.apache.org/jira/browse/HIVE-20656
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


The defaults for the following configs seem too aggressive. In Java this 
can easily lead to several full GC pauses during which little memory can be reclaimed.
{code:java}
HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
"Portion of total memory to be used by map-side group aggregation hash 
table"),
HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
(float) 0.9,
"The max memory to be used by map-side group aggregation hash table.\n" +
"If the memory usage is higher than this number, force to flush 
data"),{code}
 

We can be a little more conservative with these configs to avoid getting into GC 
pauses. 
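
For a rough sense of scale, a back-of-the-envelope sketch of what the defaults imply against an example 8 GB heap (the interpretation of the two thresholds is approximate):
{code:java}
public class MapAggrMemorySketch {
  public static void main(String[] args) {
    long maxHeapBytes = 8L * 1024 * 1024 * 1024;  // illustrative 8 GB heap
    double percentMemory = 0.99;                  // hive.map.aggr.hash.percentmemory default
    double flushThreshold = 0.9;                  // ...force.flush.memory.threshold default

    // Rough budget implied by the defaults: the hash table may claim ~99% of the
    // heap, and the force-flush check only kicks in around 90% usage, leaving
    // little GC headroom.
    System.out.printf("hash table budget: %d bytes%n", (long) (maxHeapBytes * percentMemory));
    System.out.printf("force flush at:    %d bytes%n", (long) (maxHeapBytes * flushThreshold));
  }
}
{code}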



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20649:


 Summary: LLAP aware memory manager for Orc writers
 Key: HIVE-20649
 URL: https://issues.apache.org/jira/browse/HIVE-20649
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The ORC writer has its own memory manager that estimates memory usage and available 
memory from the JVM heap (MemoryMXBean). This works for the tez container execution 
model but not in LLAP, where container sizes (and Xmx) are typically 
large and there are multiple executors per LLAP daemon. This custom memory 
manager should be aware of the memory bounds per executor. 
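
A minimal sketch of the idea, assuming the daemon Xmx, executor count, and pool fraction are known; the real implementation would plug into ORC's memory manager interface rather than a standalone helper:
{code:java}
public class LlapAwareMemorySketch {
  // Use the per-executor share instead of the whole JVM heap (MemoryMXBean)
  // when sizing ORC writer memory in LLAP.
  static long writerMemoryBudget(long daemonXmxBytes, int numExecutors, double poolFraction) {
    long perExecutor = daemonXmxBytes / Math.max(1, numExecutors);
    return (long) (perExecutor * poolFraction);
  }

  public static void main(String[] args) {
    long xmx = 64L * 1024 * 1024 * 1024;   // e.g. a 64 GB LLAP daemon
    System.out.println(writerMemoryBudget(xmx, 16, 0.5)); // budget for one executor's writers
  }
}
{code}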



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20648) LLAP: Vector group by operator should use memory per executor

2018-09-27 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20648:


 Summary: LLAP: Vector group by operator should use memory per 
executor
 Key: HIVE-20648
 URL: https://issues.apache.org/jira/browse/HIVE-20648
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The HIVE-15503 treatment has to be applied to the vector group by operator as well. 
Vector group by currently uses the MemoryMXBean to get heap usage and max heap 
memory, which does not work for LLAP. Instead it should use the memory per executor 
as the upper bound when making flush decisions.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20621) GetOperationStatus called in resultset.next causing incremental slowness

2018-09-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20621:


 Summary: GetOperationStatus called in resultset.next causing 
incremental slowness
 Key: HIVE-20621
 URL: https://issues.apache.org/jira/browse/HIVE-20621
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 4.0.0, 3.2.0
 Environment: Fetching the result set for a result-cache-hit query gets 
slower as more rows are fetched. Fetching a 10-row result set took about 
900 ms, but fetching a 200-row result set took 8 seconds. 

The reason for this slowness is that GetOperationStatus is invoked inside 
resultset.next(), and it happens for every row even after the operation has 
completed. This is one RPC call per row fetched. 
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
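
Illustrative client-side sketch only (all names hypothetical): cache the terminal operation state so next() stops issuing a status RPC per row once the operation has completed:
{code:java}
public class FetchStatusCacheSketch {
  interface StatusRpc {
    boolean checkOperationStatus();   // hypothetical stand-in for GetOperationStatus
  }

  static final class ResultSetCursor {
    private final StatusRpc rpc;
    private boolean operationCompleted;   // cached; avoids one RPC per fetched row

    ResultSetCursor(StatusRpc rpc) { this.rpc = rpc; }

    boolean nextRowAllowed() {
      if (!operationCompleted) {
        // Only poll while the operation is still running.
        operationCompleted = rpc.checkOperationStatus();
      }
      return true; // proceed to fetch/return the next row
    }
  }

  public static void main(String[] args) {
    ResultSetCursor cursor = new ResultSetCursor(() -> true);
    cursor.nextRowAllowed();  // first call polls once
    cursor.nextRowAllowed();  // subsequent calls skip the RPC
  }
}
{code}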






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20583) Use canonical hostname only for kerberos auth in HiveConnection

2018-09-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20583:


 Summary: Use canonical hostname only for kerberos auth in 
HiveConnection
 Key: HIVE-20583
 URL: https://issues.apache.org/jira/browse/HIVE-20583
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20582) Make hflush in hive proto logging configurable

2018-09-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20582:


 Summary: Make hflush in hive proto logging configurable
 Key: HIVE-20582
 URL: https://issues.apache.org/jira/browse/HIVE-20582
 Project: Hive
  Issue Type: New Feature
Affects Versions: 4.0.0, 3.2.0
 Environment: Hive proto logging does hflush to avoid the small-files issue 
in HDFS. This may not be ideal for blob storage, where hflush takes effect only 
on closing of the file. Make hflush configurable so that blob storage can do a 
close instead of an hflush. 
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20428) HiveStreamingConnection should use addPartition if not exists API

2018-08-20 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20428:


 Summary: HiveStreamingConnection should use addPartition if not 
exists API
 Key: HIVE-20428
 URL: https://issues.apache.org/jira/browse/HIVE-20428
 Project: Hive
  Issue Type: New Feature
  Components: Streaming, Transactions
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran


[https://github.com/apache/hive/blob/f280361374c6219d8734d5972c740d6d6c3fb7ef/streaming/src/java/org/apache/hive/streaming/HiveStreamingConnection.java#L379-L381]

 

catches AlreadyExistsException when adding a partition. Instead, use the 
add_partitions API with ifNotExists set to true. 
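
A hedged sketch of the suggested call, assuming IMetaStoreClient's add_partitions overload with an ifNotExists flag (flag comments are illustrative):
{code:java}
import java.util.List;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class AddPartitionsSketch {
  // Instead of catching AlreadyExistsException around add_partition(), ask the
  // metastore to ignore existing partitions in a single call.
  static void addIfNotExists(IMetaStoreClient client, List<Partition> parts) throws Exception {
    if (!parts.isEmpty()) {
      client.add_partitions(parts, /* ifNotExists */ true, /* needResults */ false);
    }
  }
}
{code}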



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20249:


 Summary: LLAP IO: NPE during refCount decrement
 Key: HIVE-20249
 URL: https://issues.apache.org/jira/browse/HIVE-20249
 Project: Hive
  Issue Type: New Feature
  Components: llap
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


This was observed on one of the old builds, which was swallowing the exception 
root cause.
{code:java}
Ignoring exception when closing input calls(cleanup). Exception 
class=java.lang.NullPointerException

java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20202) Add profiler endpoint to httpserver

2018-07-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20202:


 Summary: Add profiler endpoint to httpserver
 Key: HIVE-20202
 URL: https://issues.apache.org/jira/browse/HIVE-20202
 Project: Hive
  Issue Type: New Feature
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Add a web endpoint for profiling based on async-profiler. This servlet should 
be added to httpserver so that HS2 and LLAP daemons can output flamegraphs when 
their /prof endpoint is hit. Since this will be based on 
[https://github.com/jvm-profiling-tools/async-profiler], heap allocation, lock 
contention, HW counters, etc. will also be supported in addition to CPU 
profiling. In most cases the profiling overhead is quite low and it is safe to 
run in production. More analysis on CPU and memory overhead here 
[https://github.com/jvm-profiling-tools/async-profiler/issues/14] and 
[https://github.com/jvm-profiling-tools/async-profiler/issues/131] 

 

For the impatient, here is the usage doc and the sample output 
[https://github.com/prasanthj/nightswatch/blob/master/README.md] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20165) Enable ZLIB for streaming ingest

2018-07-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20165:


 Summary: Enable ZLIB for streaming ingest
 Key: HIVE-20165
 URL: https://issues.apache.org/jira/browse/HIVE-20165
 Project: Hive
  Issue Type: Bug
  Components: Streaming, Transactions
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Per [~gopalv]'s recommendation, tried running streaming ingest with and without 
zlib. Following are the numbers:

 
*Compression: NONE*
Total rows committed: 9380
Throughput: *156* rows/second
[prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h 
/apps/hive/warehouse/prasanth.db/culvert
*14.1 G*  /apps/hive/warehouse/prasanth.db/culvert
 
*Compression: ZLIB*
Total rows committed: 9210
Throughput: *1535000* rows/second
[prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h 
/apps/hive/warehouse/prasanth.db/culvert
*7.4 G*  /apps/hive/warehouse/prasanth.db/culvert
 
ZLIB gets us 2x compression with only ~2% lower throughput. We should 
enable ZLIB by default for streaming ingest. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20147) Hive streaming ingest is contented on synchronized logging

2018-07-11 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20147:


 Summary: Hive streaming ingest is contented on synchronized logging
 Key: HIVE-20147
 URL: https://issues.apache.org/jira/browse/HIVE-20147
 Project: Hive
  Issue Type: Bug
  Components: Streaming
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: Screen Shot 2018-07-11 at 4.17.27 PM.png

In one of the observed profiles, >30% of the time is spent on synchronized logging. See 
attachment. 

We should use async logging for hive streaming ingest by default. !Screen Shot 
2018-07-11 at 4.17.27 PM.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20129) Revert to position based schema evolution for orc tables

2018-07-09 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20129:


 Summary: Revert to position based schema evolution for orc tables
 Key: HIVE-20129
 URL: https://issues.apache.org/jira/browse/HIVE-20129
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-20129.1.patch

Hive has been doing position-based schema evolution. ORC-54 changed it to 
column-name-based schema evolution, causing unexpected results: queries that returned 
results earlier now return no results. Change the default in hive back to 
positional schema evolution. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20126) VectorizedOrcInputFormat does not pass conf to orc reader options

2018-07-09 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20126:


 Summary: VectorizedOrcInputFormat does not pass conf to orc reader 
options
 Key: HIVE-20126
 URL: https://issues.apache.org/jira/browse/HIVE-20126
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


VectorizedOrcInputFormat creates ORC reader options without passing in the 
configuration object. Without it, setting ORC configurations has no 
effect. 

Example: 

set orc.force.positional.evolution=true;

does not work for positional schema evolution (will attach test case).
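
For illustration, the difference is roughly this (a sketch using the public OrcFile API; path handling and schema details omitted):
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;

public class OrcReaderOptionsSketch {
  static Reader open(Path file, Configuration conf) throws IOException {
    // Passing conf lets settings such as orc.force.positional.evolution take effect;
    // building readerOptions from a fresh Configuration would silently drop them.
    return OrcFile.createReader(file, OrcFile.readerOptions(conf));
  }
}
{code}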



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20116) TezTask is using parent logger

2018-07-06 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20116:


 Summary: TezTask is using parent logger
 Key: HIVE-20116
 URL: https://issues.apache.org/jira/browse/HIVE-20116
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-20116.1.patch

TezTask is using parent's logger (Task). It should instead use its own class 
name.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20075) Re-enable TestTriggersWorkloadManager disabled in HIVE-20074

2018-07-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20075:


 Summary: Re-enable TestTriggersWorkloadManager disabled in 
HIVE-20074
 Key: HIVE-20075
 URL: https://issues.apache.org/jira/browse/HIVE-20075
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20074) Disable TestTriggersWorkloadManager as it is unstable again

2018-07-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20074:


 Summary: Disable TestTriggersWorkloadManager as it is unstable 
again
 Key: HIVE-20074
 URL: https://issues.apache.org/jira/browse/HIVE-20074
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20059) Hive streaming should try shade prefix unconditionally on exception

2018-07-02 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20059:


 Summary: Hive streaming should try shade prefix unconditionally on 
exception
 Key: HIVE-20059
 URL: https://issues.apache.org/jira/browse/HIVE-20059
 Project: Hive
  Issue Type: Bug
  Components: Streaming
Affects Versions: 3.1.0, 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-20059.1.patch

AbstractRecordWriter tries hive.classloader.shade.prefix on 
ClassNotFoundException, but there are instances where OrcOutputFormat from an old 
hive version gets loaded, resulting in ClassCastException. I think we should try the 
shade prefix whenever it is defined and any exception is thrown. 
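
A rough sketch of the broadened retry (loadOutputFormatClass and the example prefix are hypothetical; only the retry shape is the point):
{code:java}
public class ShadePrefixRetrySketch {
  // Previously only ClassNotFoundException triggered the shade-prefix retry; the
  // broadened version retries on any failure when a prefix is configured.
  static Class<?> loadOutputFormatClass(String className, String shadePrefix)
      throws ClassNotFoundException {
    try {
      return Class.forName(className);
    } catch (Throwable t) {
      if (shadePrefix != null && !shadePrefix.isEmpty()) {
        return Class.forName(shadePrefix + "." + className);
      }
      throw new ClassNotFoundException(className, t);
    }
  }

  public static void main(String[] args) throws Exception {
    // "com.example.shaded" is an illustrative prefix, not a real Hive setting value.
    System.out.println(loadOutputFormatClass("java.lang.String", "com.example.shaded"));
  }
}
{code}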



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20038) Update queries on non-bucketed + partitioned tables throws NPE

2018-06-29 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20038:


 Summary: Update queries on non-bucketed + partitioned tables 
throws NPE
 Key: HIVE-20038
 URL: https://issues.apache.org/jira/browse/HIVE-20038
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0, 3.2.0
Reporter: Kavan Suresh
Assignee: Prasanth Jayachandran


With HIVE-19890, delete deltas of non-bucketed tables are computed from ROW__ID. 
This can create holes in the output paths in FSOp.commit(), resulting in an NPE. 

Following is the exception:
{code:java}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commitOneOutPath(FileSinkOperator.java:246)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:235)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$400(FileSinkOperator.java:168)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1325)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:757)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20028) Metastore client cache config is used incorrectly

2018-06-28 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20028:


 Summary: Metastore client cache config is used incorrectly
 Key: HIVE-20028
 URL: https://issues.apache.org/jira/browse/HIVE-20028
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 4.0.0, 3.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Metastore client cache config is not used correctly. Enabling the cache 
actually disables it and vice versa. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20019) Remove commons-logging and move to slf4j

2018-06-27 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20019:


 Summary: Remove commons-logging and move to slf4j
 Key: HIVE-20019
 URL: https://issues.apache.org/jira/browse/HIVE-20019
 Project: Hive
  Issue Type: Improvement
  Components: Logging
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


Still seeing several references to commons-logging. We should move all classes 
to slf4j instead. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20004) Wrong scale used by ConvertDecimal64ToDecimal results in incorrect results

2018-06-26 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20004:


 Summary: Wrong scale used by ConvertDecimal64ToDecimal results in 
incorrect results
 Key: HIVE-20004
 URL: https://issues.apache.org/jira/browse/HIVE-20004
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 3.0.1, 4.0.0
Reporter: Prasanth Jayachandran


ConvertDecimal64ToDecimal uses the scale from the output column vector, which produces 
incorrect results.

Input: decimal(8,1) Output: decimal(9,2)

Input value 963.8 gets converted to 96.38, which is wrong. The scale change should not 
alter the value in this case (it should still be 963.8 after the conversion). 
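
A worked example of the scale arithmetic using plain BigDecimal (decimal64 stores an unscaled long plus a scale; the unscaled value 9638 with the input scale 1 represents 963.8):
{code:java}
import java.math.BigDecimal;

public class Decimal64ScaleSketch {
  public static void main(String[] args) {
    long unscaled = 9638L;       // decimal64 representation of 963.8
    int inputScale = 1;          // from the input type decimal(8,1)
    int outputScale = 2;         // from the output type decimal(9,2)

    // Correct: interpret the unscaled long with the *input* scale.
    System.out.println(BigDecimal.valueOf(unscaled, inputScale));   // 963.8

    // Buggy behavior described above: using the output scale changes the value.
    System.out.println(BigDecimal.valueOf(unscaled, outputScale));  // 96.38
  }
}
{code}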



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20000) woooohoo20000ooooooo

2018-06-26 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-20000:


 Summary: woooohoo20000ooooooo
 Key: HIVE-20000
 URL: https://issues.apache.org/jira/browse/HIVE-20000
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Affects Versions: All Versions
Reporter: Prasanth Jayachandran
 Fix For: All Versions






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19980) GenericUDTFGetSplits fails when order by query returns 0 rows

2018-06-25 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19980:


 Summary: GenericUDTFGetSplits fails when order by query returns 0 
rows
 Key: HIVE-19980
 URL: https://issues.apache.org/jira/browse/HIVE-19980
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When an order by query returns 0 rows, there will not be any files in the temporary 
table location for GenericUDTFGetSplits, which results in the following exception:
{code:java}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:217)
  at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:420)
  ... 52 more{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19964) Apply resource plan fails if trigger expression has quotes

2018-06-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19964:


 Summary: Apply resource plan fails if trigger expression has quotes
 Key: HIVE-19964
 URL: https://issues.apache.org/jira/browse/HIVE-19964
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Aswathy Chellammal Sreekumar


{code:java}
0: jdbc:hive2://localhost:1> CREATE TRIGGER global.big_hdfs_read WHEN 
HDFS_BYTES_READ > '300kb' DO KILL;
INFO : Compiling 
command(queryId=pjayachandran_20180621131017_72b1441b-d790-4db7-83ca-479735843890):
 CREATE TRIGGER global.big_hdfs_read WHEN HDFS_BYTES_READ > '300kb' DO KILL
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=pjayachandran_20180621131017_72b1441b-d790-4db7-83ca-479735843890);
 Time taken: 0.015 seconds
INFO : Executing 
command(queryId=pjayachandran_20180621131017_72b1441b-d790-4db7-83ca-479735843890):
 CREATE TRIGGER global.big_hdfs_read WHEN HDFS_BYTES_READ > '300kb' DO KILL
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=pjayachandran_20180621131017_72b1441b-d790-4db7-83ca-479735843890);
 Time taken: 0.025 seconds
INFO : OK
No rows affected (0.054 seconds)
0: jdbc:hive2://localhost:1> ALTER TRIGGER global.big_hdfs_read ADD TO 
UNMANAGED;
INFO : Compiling 
command(queryId=pjayachandran_20180621131031_dd489324-db23-412f-9409-32ba697a10e5):
 ALTER TRIGGER global.big_hdfs_read ADD TO UNMANAGED
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=pjayachandran_20180621131031_dd489324-db23-412f-9409-32ba697a10e5);
 Time taken: 0.014 seconds
INFO : Executing 
command(queryId=pjayachandran_20180621131031_dd489324-db23-412f-9409-32ba697a10e5):
 ALTER TRIGGER global.big_hdfs_read ADD TO UNMANAGED
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=pjayachandran_20180621131031_dd489324-db23-412f-9409-32ba697a10e5);
 Time taken: 0.029 seconds
INFO : OK
No rows affected (0.054 seconds)
0: jdbc:hive2://localhost:1> ALTER RESOURCE PLAN global ENABLE;
INFO : Compiling 
command(queryId=pjayachandran_20180621131036_26a5f4f3-91e3-4bec-ab42-800adb90104e):
 ALTER RESOURCE PLAN global ENABLE
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=pjayachandran_20180621131036_26a5f4f3-91e3-4bec-ab42-800adb90104e);
 Time taken: 0.012 seconds
INFO : Executing 
command(queryId=pjayachandran_20180621131036_26a5f4f3-91e3-4bec-ab42-800adb90104e):
 ALTER RESOURCE PLAN global ENABLE
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=pjayachandran_20180621131036_26a5f4f3-91e3-4bec-ab42-800adb90104e);
 Time taken: 0.021 seconds
INFO : OK
No rows affected (0.045 seconds)
0: jdbc:hive2://localhost:1> ALTER RESOURCE PLAN global ACTIVATE;
INFO : Compiling 
command(queryId=pjayachandran_20180621131037_551b2af0-321b-4638-8ac0-76771a159f4b):
 ALTER RESOURCE PLAN global ACTIVATE
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=pjayachandran_20180621131037_551b2af0-321b-4638-8ac0-76771a159f4b);
 Time taken: 0.017 seconds
INFO : Executing 
command(queryId=pjayachandran_20180621131037_551b2af0-321b-4638-8ac0-76771a159f4b):
 ALTER RESOURCE PLAN global ACTIVATE
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. Invalid expression: HDFS_BYTES_READ > 
300kb
INFO : Completed executing 
command(queryId=pjayachandran_20180621131037_551b2af0-321b-4638-8ac0-76771a159f4b);
 Time taken: 0.037 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.DDLTask. Invalid expression: 
HDFS_BYTES_READ > 300kb (state=08S01,code=1){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19956) Include yarn registry classes to jdbc standalone jar

2018-06-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19956:


 Summary: Include yarn registry classes to jdbc standalone jar
 Key: HIVE-19956
 URL: https://issues.apache.org/jira/browse/HIVE-19956
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HS2 Active/Passive HA requires some yarn registry classes. Include it in JDBC 
standalone jar. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19926) Remove deprecated hcatalog streaming

2018-06-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19926:


 Summary: Remove deprecated hcatalog streaming
 Key: HIVE-19926
 URL: https://issues.apache.org/jira/browse/HIVE-19926
 Project: Hive
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


hcatalog streaming is deprecated in 3.0.0. We should remove it in 4.0.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used

2018-06-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19886:


 Summary: Logs may be directed to 2 files if --hiveconf 
hive.log.file is used
 Key: HIVE-19886
 URL: https://issues.apache.org/jira/browse/HIVE-19886
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran


The hive launch script explicitly specifies the log4j2 configuration file to use. The 
main() methods in HiveServer2 and HiveMetastore reconfigure the logger based 
on user input via --hiveconf hive.log.file. This may cause logs to end up in 2 
different files: initial logs go to the file specified in 
hive-log4j2.properties, and after the logger reconfiguration the rest of the logs 
go to the file specified via --hiveconf hive.log.file. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19877) Remove setting hive.execution.engine as mr in HiveStreamingConnection

2018-06-12 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19877:


 Summary: Remove setting hive.execution.engine as mr in 
HiveStreamingConnection
 Key: HIVE-19877
 URL: https://issues.apache.org/jira/browse/HIVE-19877
 Project: Hive
  Issue Type: Bug
  Components: Streaming
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HiveStreamingConnection explicitly sets the execution engine to mr, which is leftover 
from old code. It is no longer required. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19875) increase LLAP IO queue size for perf

2018-06-12 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19875:


 Summary: increase LLAP IO queue size for perf
 Key: HIVE-19875
 URL: https://issues.apache.org/jira/browse/HIVE-19875
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


According to [~gopalv], the queue limit has a perf impact, especially during hashtable 
load for mapjoin, where in the past IO used to queue up more data for processors to 
work on.
1) Overall, the default limit could be adjusted higher.
2) Depending on Decimal64 availability, the weight for decimal columns could be 
reduced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19873) Cleanup operation log on query cancellation after some delay

2018-06-12 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19873:


 Summary: Cleanup operation log on query cancellation after some 
delay
 Key: HIVE-19873
 URL: https://issues.apache.org/jira/browse/HIVE-19873
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When a query is executed using beeline and is cancelled due to a query 
timeout, kill query, or triggers, and there is a cursor on the operation log row 
set, the cursor can throw an exception because cancel cleans up the operation 
log in the background. This can return a non-zero exit code in beeline. So add 
a delay to the cleanup of the operation log on operation cancel. 
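
A minimal sketch of the delayed cleanup (delay value, thread name, and helper names are illustrative):
{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DelayedLogCleanupSketch {
  private static final ScheduledExecutorService CLEANUP =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "operation-log-cleanup");
        t.setDaemon(true);
        return t;
      });

  // On cancel, give readers of the operation log a grace period before deleting it,
  // so a beeline cursor iterating the log row set does not hit a missing file.
  static void cleanupAfterDelay(Runnable deleteOperationLog, long delaySeconds) {
    CLEANUP.schedule(deleteOperationLog, delaySeconds, TimeUnit.SECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    cleanupAfterDelay(() -> System.out.println("operation log removed"), 1);
    Thread.sleep(1500);   // let the scheduled cleanup run before the JVM exits
  }
}
{code}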



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19864) Address TestTriggersWorkloadManager flakiness

2018-06-11 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19864:


 Summary: Address TestTriggersWorkloadManager flakiness
 Key: HIVE-19864
 URL: https://issues.apache.org/jira/browse/HIVE-19864
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


TestTriggersWorkloadManager seems flaky and all test cases get timed out at 
times. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19852) update jackson to latest

2018-06-10 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19852:


 Summary: update jackson to latest
 Key: HIVE-19852
 URL: https://issues.apache.org/jira/browse/HIVE-19852
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Update jackson version to latest 2.9.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19851) upgrade jQuery version

2018-06-10 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19851:


 Summary: upgrade jQuery version
 Key: HIVE-19851
 URL: https://issues.apache.org/jira/browse/HIVE-19851
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The jQuery version seems to be very old. Update to the latest stable version. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19817) Hive streaming API + dynamic partitioning + json/regex writer does not work

2018-06-06 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19817:


 Summary: Hive streaming API + dynamic partitioning + json/regex 
writer does not work
 Key: HIVE-19817
 URL: https://issues.apache.org/jira/browse/HIVE-19817
 Project: Hive
  Issue Type: Bug
  Components: Streaming
Affects Versions: 3.1.0, 3.0.1, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The new streaming API with dynamic partitioning only works with the delimited record 
writer. The Json and Regex writers do not work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19799) remove jasper dependency

2018-06-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19799:


 Summary: remove jasper dependency
 Key: HIVE-19799
 URL: https://issues.apache.org/jira/browse/HIVE-19799
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


The jasper dependency version looks old and unwanted. There is a comment saying 
it is required by thrift, but I don't see jasper as a thrift dependency. Try 
removing it to see if it's safe (after a precommit test run). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19794) Disable removing order by from subquery in GenericUDTFGetSplits

2018-06-05 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19794:


 Summary: Disable removing order by from subquery in 
GenericUDTFGetSplits
 Key: HIVE-19794
 URL: https://issues.apache.org/jira/browse/HIVE-19794
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


spark-llap always wraps the query in a subquery. Until that is removed from 
spark-llap, the hive compiler would remove the inner order by in 
GenericUDTFGetSplits; disable that optimization until then.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19792) Enable schema evolution tests for decimal 64

2018-06-04 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19792:


 Summary: Enable schema evolution tests for decimal 64
 Key: HIVE-19792
 URL: https://issues.apache.org/jira/browse/HIVE-19792
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


The following tests are disabled in HIVE-19629 as ORC's ConvertTreeReaderFactory does 
not handle Decimal64ColumnVectors. This jira is to re-enable those tests once 
ORC supports it. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19772:


 Summary: Streaming ingest V2 API can generate invalid orc file if 
interrupted
 Key: HIVE-19772
 URL: https://issues.apache.org/jira/browse/HIVE-19772
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.0, 3.0.1, 4.0.0
Reporter: Gopal V
Assignee: Prasanth Jayachandran


Hive streaming ingest generated 0-length and 3-byte files, which are invalid ORC 
files. This throws the following exception during compaction:

{code}
Error: org.apache.orc.FileFormatException: Not a valid ORC file 
hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
 (maxFileLength= 3) at 
org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
 at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
 at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
 at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19742) Fix test added in HIVE-19726 by running with -Duser.timezone="Europe/Paris"

2018-05-30 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19742:


 Summary: Fix test added in HIVE-19726 by running with 
-Duser.timezone="Europe/Paris"
 Key: HIVE-19742
 URL: https://issues.apache.org/jira/browse/HIVE-19742
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0, 4.0.0
Reporter: Prasanth Jayachandran


Make sure test added in HIVE-19726 works with Paris timezone after fixing 
ORC-370. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19726) ORC date PPD is broken

2018-05-29 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19726:


 Summary: ORC date PPD is broken
 Key: HIVE-19726
 URL: https://issues.apache.org/jira/browse/HIVE-19726
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.4.0, 3.1.0, 3.0.1, 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


When kryo was at version 2.22 we added a fix in HIVE-7222 and later in 
HIVE-10819. Now that we have updated kryo to 3.0.3, that old workaround was 
never removed. The issue was that kryo serialized Timestamp as the Date type, so to 
recover the timestamp we deserialized *any* Date instance to a Timestamp object 
during deserialization, which is wrong (we don't know whether a Date was 
serialized as a Date or a Timestamp was serialized as a Date in the first place). 
This breaks PPD on date columns because kryo deserialization always converts Date 
to Timestamp, causing a type mismatch.
Now that we have a newer kryo version we can remove the code added in HIVE-10819. 
  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19664) LLAP reader changes for decimal 64

2018-05-22 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19664:


 Summary: LLAP reader changes for decimal 64
 Key: HIVE-19664
 URL: https://issues.apache.org/jira/browse/HIVE-19664
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


With ORC 1.5.0, LLAP readers have to be updated to support decimal 64. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19641) sync up hadoop version used by storage-api with hive

2018-05-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19641:


 Summary: sync up hadoop version used by storage-api with hive
 Key: HIVE-19641
 URL: https://issues.apache.org/jira/browse/HIVE-19641
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


There is a hadoop version mismatch between hive and storage-api, and hence 
different transitive dependency versions get pulled in.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19640) dependency version upgrades/fixes/convergence

2018-05-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19640:


 Summary: dependency version upgrades/fixes/convergence
 Key: HIVE-19640
 URL: https://issues.apache.org/jira/browse/HIVE-19640
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


There are several dependency versioning issues: some jars are old, some appear in 
multiple versions, there are transitive version conflicts, etc.

This is an umbrella jira to fix up all of that. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19637) Add slow test report script to testutils

2018-05-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19637:


 Summary: Add slow test report script to testutils
 Key: HIVE-19637
 URL: https://issues.apache.org/jira/browse/HIVE-19637
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Wrote the attached utility script to find the top K slow tests from a precommit test 
URL. Would like to get it committed to testutils so that it's useful for 
everyone.

{code:title=ascii mode}
$ python gen-report.py -b 11102 -a

Processing 1073 test xml reports from 
http://104.198.109.242/logs/PreCommit-HIVE-Build-11102/test-results/..

Top 25 testsuite in terms of execution time (in seconds).. [Total time: 
73882.661 seconds]

##
██  20806  TestCliDriver
███  9601  
TestMiniLlapLocalCliDriver
███  8210  TestSparkCliDriver
██   2744  TestMinimrCliDriver
█2262  
TestEncryptedHDFSCliDriver
 2021  
TestMiniSparkOnYarnCliDriver
 1808  TestHiveCli
███  1566  TestMiniLlapCliDriver
███  1345  
TestReplicationScenarios
██   1238  
TestMiniDruidCliDriver
██940  TestNegativeCliDriver
██865  TestHBaseCliDriver
█ 681  TestMiniTezCliDriver
█ 555  
TestTxnCommands2WithSplitUpdateAndVectorization
█ 543  TestCompactor
█ 528  TestTxnCommands2
  378  TestStreaming
  374  
TestBlobstoreCliDriver
  328  
TestNegativeMinimrCliDriver
  302  
TestTxnCommandsWithSplitUpdateAndVectorization
  301  TestHCatClient
  299  TestTxnCommands
  261  TestTxnLoadData
  258  TestAcidOnTez
  240  
TestHBaseNegativeCliDriver

Top 25 testcases in terms of execution time (in seconds).. [Total time: 
63102.607 seconds]

###
██  680  
TestMinimrCliDriver_testCliDriver[infer_bucket_sort_reducers_power_two]
█   623  
TestMinimrCliDriver_testCliDriver[infer_bucket_sort_map_operators]
███ 429  
TestMinimrCliDriver_testCliDriver[infer_bucket_sort_dyn_part]
███ 374  
TestSparkCliDriver_testCliDriver[vectorization_short_regress]
███ 374  
TestMiniLlapLocalCliDriver_testCliDriver[vectorization_short_regress]
330  
TestMiniDruidCliDriver_testCliDriver[druidmini_dynamic_partition]
█   238  
TestMiniLlapLocalCliDriver_testCliDriver[vector_outer_join5]
227  
TestMiniDruidCliDriver_testCliDriver[druidmini_test_insert]
███ 214  
TestEncryptedHDFSCliDriver_testCliDriver[encryption_auto_purge_tables]
███ 211  
TestMiniLlapCliDriver_testCliDriver[unionDistinct_1]
███ 210  
TestMiniSparkOnYarnCliDriver_testCliDriver[vector_outer_join5]
███ 206  
TestMinimrCliDriver_testCliDriver[bucket_num_reducers_acid]
██  202  
TestMinimrCliDriver_testCliDriver[infer_bucket_sort_merge]
██  198  
TestCliDriver_testCliDriver[typechangetest]

[jira] [Created] (HIVE-19636) Fix druidmini_dynamic_partition.q slowness

2018-05-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19636:


 Summary: Fix druidmini_dynamic_partition.q slowness
 Key: HIVE-19636
 URL: https://issues.apache.org/jira/browse/HIVE-19636
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran


druidmini_dynamic_partition.q runs for >5 mins



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19635) Fix vectorization_short_regress slowness

2018-05-21 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19635:


 Summary: Fix vectorization_short_regress slowness
 Key: HIVE-19635
 URL: https://issues.apache.org/jira/browse/HIVE-19635
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran


The vectorization_short_regress.q file runs for >5 mins on each cli driver



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

