[jira] [Created] (HIVE-26177) Create a new connection pool for compaction (DataNucleus)

2022-04-26 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-26177:
--

 Summary: Create a new connection pool for compaction (DataNucleus)
 Key: HIVE-26177
 URL: https://issues.apache.org/jira/browse/HIVE-26177
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26176) Create a new connection pool for compaction (CompactionTxnHandler)

2022-04-26 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-26176:
--

 Summary: Create a new connection pool for compaction 
(CompactionTxnHandler)
 Key: HIVE-26176
 URL: https://issues.apache.org/jira/browse/HIVE-26176
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-26155) Create a new connection pool for compaction

2022-04-19 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-26155:
--

 Summary: Create a new connection pool for compaction
 Key: HIVE-26155
 URL: https://issues.apache.org/jira/browse/HIVE-26155
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Currently the TxnHandler uses 2 connection pools to communicate with the HMS: 
the default one and one for mutexing. If compaction is configured incorrectly 
(e.g. too many Initiators are running on the same db) then compaction can use 
up all the connections in the default connection pool and all user queries can 
get stuck.

We should have a separate connection pool (configurable size) just for 
compaction-related activities.
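
A minimal sketch of the isolation this asks for, not the actual Hive change.
Each pool is modeled as a counting semaphore instead of a real JDBC connection
pool; the class name PoolIsolationSketch, the method names and the pool sizes
are assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: each "pool" is a semaphore with a fixed number of permits.
final class PoolIsolationSketch {
    private final Semaphore userPool;
    private final Semaphore compactionPool;

    PoolIsolationSketch(int userSize, int compactionSize) {
        this.userPool = new Semaphore(userSize);
        this.compactionPool = new Semaphore(compactionSize);
    }

    // Each method returns true if a "connection" was obtained without waiting.
    boolean tryAcquireForUserQuery() { return userPool.tryAcquire(); }
    boolean tryAcquireForCompaction() { return compactionPool.tryAcquire(); }

    void releaseUserQuery() { userPool.release(); }
    void releaseCompaction() { compactionPool.release(); }

    public static void main(String[] args) {
        PoolIsolationSketch pools = new PoolIsolationSketch(2, 1);
        // A misconfigured compaction setup takes every connection it can get...
        while (pools.tryAcquireForCompaction()) { }
        // ...but user queries still find a free connection in their own pool.
        System.out.println("user query got a connection: " + pools.tryAcquireForUserQuery());
    }
}
```

With separate pools, exhausting the compaction permits leaves the permits for
user queries untouched.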





[jira] [Created] (HIVE-26060) Invalidate acid table directory cache on drop table

2022-03-22 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-26060:
--

 Summary: Invalidate acid table directory cache on drop table
 Key: HIVE-26060
 URL: https://issues.apache.org/jira/browse/HIVE-26060
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0-alpha-1
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-26048) Missing quotation mark in findReadyToClean query

2022-03-18 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-26048:
--

 Summary: Missing quotation mark in findReadyToClean query
 Key: HIVE-26048
 URL: https://issues.apache.org/jira/browse/HIVE-26048
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


A missing quotation mark causes a PostgreSQL column error:

{code}
2022-03-18T00:53:43,314 ERROR [Thread-651] compactor.Cleaner: Caught an exception in the main loop of compactor cleaner, MetaException(message:Unable to connect to transaction database org.postgresql.util.PSQLException: ERROR: column "cq_retry_retention" does not exist
  Position: 485
  at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
  at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
  at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
  at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
  at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307)
  at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293)
  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270)
  at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:224)
  at org.apache.hive.com.zaxxer.hikari.pool.ProxyStatement.executeQuery(ProxyStatement.java:108)
  at org.apache.hive.com.zaxxer.hikari.pool.HikariProxyStatement.executeQuery(HikariProxyStatement.java)
  at org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler.findReadyToClean(CompactionTxnHandler.java:374)
  at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.run(Cleaner.java:146)
)
  at org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler.findReadyToClean(CompactionTxnHandler.java:397)
  at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.run(Cleaner.java:146)
{code}
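
The error above is consistent with PostgreSQL folding unquoted identifiers to
lower case: if the column was created under a quoted upper-case name, an
unquoted reference to it is looked up as cq_retry_retention and not found. A
small sketch of quoting identifiers when building such a query follows. The
helper class and its methods are hypothetical and are not the actual
CompactionTxnHandler code; the table and column names are ones mentioned in
these issues.

```java
// Hypothetical helper, not the actual Hive code: quotes SQL identifiers so that
// PostgreSQL does not fold them to lower case.
final class IdentifierQuotingSketch {
    static String quote(String identifier) {
        // A double quote inside a quoted identifier is escaped by doubling it.
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Builds "SELECT <quoted columns> FROM <quoted table>".
    static String selectColumns(String table, String... columns) {
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(quote(columns[i]));
        }
        return sql.append(" FROM ").append(quote(table)).toString();
    }

    public static void main(String[] args) {
        System.out.println(
            selectColumns("COMPACTION_QUEUE", "CQ_COMMIT_TIME", "CQ_RETRY_RETENTION"));
    }
}
```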






[jira] [Created] (HIVE-25986) statement id is incorrect in case of load in path to MM table

2022-02-25 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25986:
--

 Summary: statement id is incorrect in case of load in path to MM 
table
 Key: HIVE-25986
 URL: https://issues.apache.org/jira/browse/HIVE-25986
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits








[jira] [Created] (HIVE-25862) Persist the time of last run in the initiator

2022-01-11 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25862:
--

 Summary: Persist the time of last run in the initiator
 Key: HIVE-25862
 URL: https://issues.apache.org/jira/browse/HIVE-25862
 Project: Hive
  Issue Type: Improvement
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


The time of last run is used as a filter when finding compaction candidates.
Because it is only stored in memory, we lose this filtering capability if the 
service restarts, so it would make sense to persist it.





[jira] [Created] (HIVE-25252) All new mewLower case

2021-06-15 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25252:
--

 Summary: All new mewLower case
 Key: HIVE-25252
 URL: https://issues.apache.org/jira/browse/HIVE-25252
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
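
A sketch of producing the lower-case names listed above. The prefix comes from
the examples; the class and method are hypothetical. Locale.ROOT is used so
that the result does not depend on the default locale of the JVM.

```java
import java.util.Locale;

// Hypothetical sketch: builds a lower-case metric name from a compaction type.
final class WorkerCycleMetricNameSketch {
    static String workerCycleMetric(String compactionType) {
        return ("compaction_worker_cycle_" + compactionType).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(workerCycleMetric("MINOR"));
        System.out.println(workerCycleMetric("MAJOR"));
    }
}
```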



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25215) tables_with_x_aborted_transactions should count partition/unpartitioned tables

2021-06-07 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25215:
--

 Summary: tables_with_x_aborted_transactions should count 
partition/unpartitioned tables
 Key: HIVE-25215
 URL: https://issues.apache.org/jira/browse/HIVE-25215
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


The Initiator compares each partition's number of aborts to 
hive.compactor.abortedtxn.threshold, so tables_with_x_aborted_transactions 
should reflect the number of partitions/unpartitioned tables with >x aborts, 
instead of the number of tables with >x aborts.
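
A sketch of the counting rule described above: every partition, and every
unpartitioned table, is one entry and is compared to the threshold on its own.
The Entry type, the method name and the threshold used in main are
illustrative assumptions, not the actual metric code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the actual metric implementation in Hive.
final class AbortedTxnMetricSketch {
    // One entry per partition, or per table if the table is unpartitioned.
    static final class Entry {
        final String table;
        final String partition; // null for an unpartitioned table
        final int abortedTxns;

        Entry(String table, String partition, int abortedTxns) {
            this.table = table;
            this.partition = partition;
            this.abortedTxns = abortedTxns;
        }
    }

    // Counts partitions and unpartitioned tables with more than `threshold` aborts.
    static long countAboveThreshold(List<Entry> entries, int threshold) {
        long count = 0;
        for (Entry e : entries) {
            if (e.abortedTxns > threshold) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry("t1", "p=1", 2000),
            new Entry("t1", "p=2", 1800),
            new Entry("t2", null, 1600),
            new Entry("t3", "p=1", 10));
        // t1 contributes two partitions, t2 one unpartitioned table, t3 nothing.
        System.out.println(countAboveThreshold(entries, 1500)); // 1500 is an example value
    }
}
```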





[jira] [Created] (HIVE-25081) Put metrics collection behind a feature flag

2021-04-30 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25081:
--

 Summary: Put metrics collection behind a feature flag
 Key: HIVE-25081
 URL: https://issues.apache.org/jira/browse/HIVE-25081
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Most metrics we're creating are collected in AcidMetricsService, which is 
behind a feature flag. However, some metrics are collected outside of the 
service. These should be behind a feature flag in addition to 
hive.metastore.metrics.enabled.





[jira] [Created] (HIVE-25080) Create metric about oldest entry in "ready for cleaning" state

2021-04-30 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25080:
--

 Summary: Create metric about oldest entry in "ready for cleaning" 
state
 Key: HIVE-25080
 URL: https://issues.apache.org/jira/browse/HIVE-25080
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


When a compaction txn commits, COMPACTION_QUEUE.CQ_COMMIT_TIME is updated with 
the current time and the compaction state is set to "ready for cleaning". 
(The Cleaner then runs and, if all goes well, sets the state to "succeeded".)

Based on this we know (roughly) how long a compaction has been in the "ready 
for cleaning" state.

We should create a metric, similar to compaction_oldest_enqueue_age_in_sec, 
that would show that the Cleaner is blocked by something, i.e. find the 
compaction in "ready for cleaning" that has the oldest commit time.

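The metric described above could be computed along these lines. This is a
sketch only: it assumes the commit times of the "ready for cleaning" entries
are available as epoch milliseconds, and the class and method names are made
up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an "oldest ready for cleaning" age metric.
final class OldestCleaningAgeSketch {
    // commitTimesMillis: commit times of the entries currently in "ready for cleaning".
    // Returns the age in seconds of the oldest entry, or 0 if there are none.
    static long oldestAgeInSec(List<Long> commitTimesMillis, long nowMillis) {
        long oldest = Long.MAX_VALUE;
        for (long t : commitTimesMillis) {
            oldest = Math.min(oldest, t);
        }
        return oldest == Long.MAX_VALUE ? 0 : (nowMillis - oldest) / 1000;
    }

    public static void main(String[] args) {
        List<Long> commitTimes = Arrays.asList(10_000L, 4_000L, 7_000L);
        System.out.println(oldestAgeInSec(commitTimes, 64_000L));
    }
}
```

A value that keeps growing between collections would indicate that the Cleaner
is not making progress on the oldest entry.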




[jira] [Created] (HIVE-25079) Create new metric about number of writes to tables with manually disabled compaction

2021-04-30 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25079:
--

 Summary: Create new metric about number of writes to tables with 
manually disabled compaction
 Key: HIVE-25079
 URL: https://issues.apache.org/jira/browse/HIVE-25079
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Create a new metric that measures the number of writes to tables that have 
compaction turned off manually. It does not matter if the write is committed or 
aborted (both are bad...)





[jira] [Created] (HIVE-25037) Create metric: Number of tables with > x aborts

2021-04-20 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25037:
--

 Summary: Create metric: Number of tables with > x aborts
 Key: HIVE-25037
 URL: https://issues.apache.org/jira/browse/HIVE-25037
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Create a metric for the number of tables with > x aborts.
x should be configurable and default to 1500.





[jira] [Created] (HIVE-25021) Divide oldest_open_txn into oldest replication and non-replication transactions

2021-04-15 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25021:
--

 Summary: Divide oldest_open_txn into oldest replication and 
non-replication transactions
 Key: HIVE-25021
 URL: https://issues.apache.org/jira/browse/HIVE-25021
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


We should have separate metrics (age and txn id) for 
the oldest replication txn (TXN_TYPE==1) and 
the oldest non-replication txn (TXN_TYPE!=1), 
so that recommendations can be tailored to the different cases.
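
A sketch of splitting the oldest open txn by type. Only the TXN_TYPE==1 rule
comes from the description above; the Txn class, its fields and the method
name are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the actual metrics code.
final class OldestOpenTxnSketch {
    static final int REPL_TXN_TYPE = 1; // TXN_TYPE of replication txns, as stated above

    static final class Txn {
        final long id;
        final int type;
        final long startedMillis;

        Txn(long id, int type, long startedMillis) {
            this.id = id;
            this.type = type;
            this.startedMillis = startedMillis;
        }
    }

    // Returns the oldest open txn of the requested kind, or null if there is none.
    static Txn oldest(List<Txn> openTxns, boolean replication) {
        Txn result = null;
        for (Txn t : openTxns) {
            boolean isRepl = t.type == REPL_TXN_TYPE;
            if (isRepl == replication
                    && (result == null || t.startedMillis < result.startedMillis)) {
                result = t;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Txn> open = Arrays.asList(
            new Txn(1, 0, 100), new Txn(2, 1, 50), new Txn(3, 1, 70), new Txn(4, 0, 80));
        System.out.println("oldest replication txn id: " + oldest(open, true).id);
        System.out.println("oldest non-replication txn id: " + oldest(open, false).id);
    }
}
```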





[jira] [Created] (HIVE-25019) Rename metrics that have spaces in the name

2021-04-15 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25019:
--

 Summary: Rename metrics that have spaces in the name
 Key: HIVE-25019
 URL: https://issues.apache.org/jira/browse/HIVE-25019
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


The metrics "num_compactions_ready for cleaning" and "num_compactions_not 
initiated" contain spaces.

They should be renamed to "num_compactions_ready_for_cleaning" and 
"num_compactions_not_initiated" respectively.

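A sketch of deriving such a name from a compaction state that contains spaces.
The num_compactions_ prefix and the state strings come from the names quoted
above; the class and method are hypothetical.

```java
// Hypothetical sketch: builds a metric name without spaces from a compaction state.
final class CompactionStateMetricSketch {
    static String metricName(String state) {
        return "num_compactions_" + state.replace(' ', '_');
    }

    public static void main(String[] args) {
        System.out.println(metricName("ready for cleaning"));
        System.out.println(metricName("not initiated"));
    }
}
```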




[jira] [Created] (HIVE-25018) Create new metrics about Initiator / Cleaner failures

2021-04-15 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25018:
--

 Summary: Create new metrics about Initiator / Cleaner failures
 Key: HIVE-25018
 URL: https://issues.apache.org/jira/browse/HIVE-25018
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Two new metrics should be defined:

Failed Initiator cycles
Failed Cleaner cycles

They should be measured as part of the error handling in the services; the lock 
timeout on the AUX lock should be ignored.
These should be RatioGauges (fail / success).
A RatioGauge implementation is available in the metrics package in common; a 
similar one should be created in the metastore. The one in common is built on 
top of the MetricsVariable interface, where the metric is provided from 
outside. In the metastore it should be done like the Gauge implementation, 
where the metrics class handles the AtomicIntegers.
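
A sketch of a fail / success ratio backed by AtomicIntegers that the class
owns itself. It uses no metrics library; the class and method names are
hypothetical, and reporting NaN when there has been no successful cycle yet is
a choice made for this sketch, not something stated in the issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a ratio gauge that owns its counters.
final class FailureRatioSketch {
    private final AtomicInteger failed = new AtomicInteger();
    private final AtomicInteger succeeded = new AtomicInteger();

    void markFailure() { failed.incrementAndGet(); }
    void markSuccess() { succeeded.incrementAndGet(); }

    // fail / success; NaN while the denominator is still zero.
    double getValue() {
        int s = succeeded.get();
        return s == 0 ? Double.NaN : (double) failed.get() / s;
    }

    public static void main(String[] args) {
        FailureRatioSketch initiatorCycles = new FailureRatioSketch();
        initiatorCycles.markFailure();
        for (int i = 0; i < 4; i++) {
            initiatorCycles.markSuccess();
        }
        System.out.println(initiatorCycles.getValue());
    }
}
```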





[jira] [Created] (HIVE-25009) Compaction worker and initiator version check can cause NPE if the COMPACTION_QUEUE is empty

2021-04-13 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-25009:
--

 Summary: Compaction worker and initiator version check can cause 
NPE if the COMPACTION_QUEUE is empty
 Key: HIVE-25009
 URL: https://issues.apache.org/jira/browse/HIVE-25009
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits








[jira] [Created] (HIVE-24727) Cache hydration api in llap proto

2021-02-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24727:
--

 Summary: Cache hydration api in llap proto
 Key: HIVE-24727
 URL: https://issues.apache.org/jira/browse/HIVE-24727
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-24729) Implement strategy for llap cache hydration

2021-02-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24729:
--

 Summary: Implement strategy for llap cache hydration
 Key: HIVE-24729
 URL: https://issues.apache.org/jira/browse/HIVE-24729
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-24728) Low level reader for llap cache hydration

2021-02-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24728:
--

 Summary: Low level reader for llap cache hydration
 Key: HIVE-24728
 URL: https://issues.apache.org/jira/browse/HIVE-24728
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-24726) Track required data for cache hydration

2021-02-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24726:
--

 Summary: Track required data for cache hydration
 Key: HIVE-24726
 URL: https://issues.apache.org/jira/browse/HIVE-24726
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-24725) Collect top priority items from llap cache policy

2021-02-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24725:
--

 Summary: Collect top priority items from llap cache policy
 Key: HIVE-24725
 URL: https://issues.apache.org/jira/browse/HIVE-24725
 Project: Hive
  Issue Type: Sub-task
Reporter: Antal Sinkovits








[jira] [Created] (HIVE-24722) LLAP cache hydration

2021-02-02 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24722:
--

 Summary: LLAP cache hydration
 Key: HIVE-24722
 URL: https://issues.apache.org/jira/browse/HIVE-24722
 Project: Hive
  Issue Type: Improvement
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Provide a way to save and reload the contents of the cache in the llap daemons.





[jira] [Created] (HIVE-24653) Race condition between compactor marker generation and get splits

2021-01-18 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24653:
--

 Summary: Race condition between compactor marker generation and 
get splits
 Key: HIVE-24653
 URL: https://issues.apache.org/jira/browse/HIVE-24653
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


In a rare scenario it's possible that the compactor moves the files to the 
final location before creating the compactor marker, so they can be fetched by 
get splits before the marker is created.

2020-09-14 04:55:25,978 [ERROR] ORC_GET_SPLITS #4 |io.AcidUtils|: Failed to read hdfs://host/warehouse/tablespace/managed/hive/database.db/table/partition=x/base_0011535/_metadata_acid: No content to map to Object due to end of input
java.io.EOFException: No content to map to Object due to end of input






[jira] [Created] (HIVE-24475) Generalize fixacidkeyindex utility

2020-12-03 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24475:
--

 Summary: Generalize fixacidkeyindex utility
 Key: HIVE-24475
 URL: https://issues.apache.org/jira/browse/HIVE-24475
 Project: Hive
  Issue Type: Improvement
  Components: ORC, Transactions
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


There is a utility in hive which can validate/fix corrupted hive.acid.key.index.
hive --service fixacidkeyindex
Unfortunately it is only tailored for a specific problem 
(https://issues.apache.org/jira/browse/HIVE-18907), instead of generally 
validating and recovering the hive.acid.key.index from the stripe data itself.





[jira] [Created] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-24293:
--

 Summary: Integer overflow in llap collision mask
 Key: HIVE-24293
 URL: https://issues.apache.org/jira/browse/HIVE-24293
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits








[jira] [Created] (HIVE-23847) Extracting hive-parser module broke exec jar upload in tez

2020-07-14 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-23847:
--

 Summary: Extracting hive-parser module broke exec jar upload in tez
 Key: HIVE-23847
 URL: https://issues.apache.org/jira/browse/HIVE-23847
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits


2020-07-13 16:53:50,551 [INFO] [Dispatcher thread {Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1594632473849_0001_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 1, taskAttemptId=attempt_1594632473849_0001_1_00_00_0, 
creationTime=1594652027059, allocationTime=1594652028460, 
startTime=1594652029356, finishTime=1594652030546, timeTaken=1190, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, 
diagnostics=Error: Error while running task ( failure ) : 
attempt_1594632473849_0001_1_00_00_0:java.lang.RuntimeException: 
java.lang.RuntimeException: Map operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
  at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
  at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:340)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
  ... 16 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/ParseException
  at java.lang.Class.getDeclaredConstructors0(Native Method)
  at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
  at java.lang.Class.getConstructor0(Class.java:3075)
  at java.lang.Class.getDeclaredConstructor(Class.java:2178)
  at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:79)
  at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:225)
  at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:217)
  at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:544)
  at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:154)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.isConsistentWithinQuery(ExprNodeEvaluator.java:117)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:102)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEvals(ExprNodeEvaluatorFactory.java:76)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:69)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:359)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:548)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:502)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:368)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:506)
  at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:303)
  ... 17 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.parse.ParseException
  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
  at sun.misc.Launcher$AppC

[jira] [Created] (HIVE-23741) Store CacheTags in the file cache level

2020-06-22 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-23741:
--

 Summary: Store CacheTags in the file cache level
 Key: HIVE-23741
 URL: https://issues.apache.org/jira/browse/HIVE-23741
 Project: Hive
  Issue Type: Improvement
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


CacheTags are currently stored for every data buffer. The strings are 
interned, but the number of cache tag objects can be reduced by moving them 
to the file cache level and back-referencing them from the buffers.





[jira] [Created] (HIVE-22992) ZkRegistryBase caching mechanism only caches the first instance

2020-03-06 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-22992:
--

 Summary: ZkRegistryBase caching mechanism only caches the first 
instance
 Key: HIVE-22992
 URL: https://issues.apache.org/jira/browse/HIVE-22992
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


ZkRegistryBase caching mechanism only caches the first instance of the llap 
node running on the same host.





[jira] [Created] (HIVE-22898) CharsetDecoder race condition in OrcRecordUpdater

2020-02-18 Thread Antal Sinkovits (Jira)
Antal Sinkovits created HIVE-22898:
--

 Summary: CharsetDecoder race condition in OrcRecordUpdater 
 Key: HIVE-22898
 URL: https://issues.apache.org/jira/browse/HIVE-22898
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Instances of CharsetDecoder are not thread safe, causing a race condition in 
OrcRecordUpdater.
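
One common way to avoid sharing a CharsetDecoder between threads is to keep
one instance per thread. The sketch below shows that pattern only; it is not
claimed to be the fix applied in OrcRecordUpdater, and the class and method
names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: one decoder per thread instead of a shared instance.
final class DecoderPerThreadSketch {
    // Each thread lazily gets its own decoder, so decoder state is never shared.
    private static final ThreadLocal<CharsetDecoder> UTF8_DECODER =
        ThreadLocal.withInitial(() -> StandardCharsets.UTF_8.newDecoder());

    static String decode(byte[] bytes) {
        try {
            // decode(ByteBuffer) performs a complete decoding operation and
            // resets the decoder first, so reuse within one thread is safe.
            return UTF8_DECODER.get().decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(bytes).length());
    }
}
```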





[jira] [Created] (HIVE-21949) Revert HIVE-21232 LLAP: Add a cache-miss friendly split affinity provider

2019-07-03 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21949:
--

 Summary: Revert HIVE-21232 LLAP: Add a cache-miss friendly split 
affinity provider
 Key: HIVE-21949
 URL: https://issues.apache.org/jira/browse/HIVE-21949
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits
 Attachments: HIVE-21949.01.patch





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21610) Union operator can flow in the wrong stage causing NPE

2019-04-12 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21610:
--

 Summary: Union operator can flow in the wrong stage causing NPE
 Key: HIVE-21610
 URL: https://issues.apache.org/jira/browse/HIVE-21610
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Because of HIVE-16227 it can happen that a UnionOperator will partially go into 
the wrong stage, because the currTask is changed, and the UnionOperator is 
reinitialized in GenMRFileSink1 with the wrong task.





[jira] [Created] (HIVE-21570) Convert llap iomem servlets output to json format

2019-04-03 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21570:
--

 Summary: Convert llap iomem servlets output to json format
 Key: HIVE-21570
 URL: https://issues.apache.org/jira/browse/HIVE-21570
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits








[jira] [Created] (HIVE-21315) Consolidate rawDataSize stat calculation

2019-02-25 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21315:
--

 Summary: Consolidate rawDataSize stat calculation 
 Key: HIVE-21315
 URL: https://issues.apache.org/jira/browse/HIVE-21315
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Antal Sinkovits


The rawDataSize statistic represents the size of the table when loaded into 
memory. This value is sometimes used to determine whether a table should be 
used in a map join.
It should probably be the same regardless of the underlying file format used.





[jira] [Created] (HIVE-21284) StatsWork should use footer scan for Parquet

2019-02-18 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21284:
--

 Summary: StatsWork should use footer scan for Parquet
 Key: HIVE-21284
 URL: https://issues.apache.org/jira/browse/HIVE-21284
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits








[jira] [Created] (HIVE-21035) Race condition in SparkUtilities#getSparkSession

2018-12-12 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21035:
--

 Summary: Race condition in SparkUtilities#getSparkSession
 Key: HIVE-21035
 URL: https://issues.apache.org/jira/browse/HIVE-21035
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


When multiple queries are executed in a single session, a race condition can 
cause multiple Spark application masters to be started.
In this case the one that started earlier is not killed when the Hive session 
closes, and it keeps consuming resources.
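
A sketch of one way to make "get or create" atomic per Hive session, using
ConcurrentHashMap.computeIfAbsent. It illustrates the pattern only and is not
the actual SparkUtilities code; the class, the map and the stand-in session
object are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical sketch: at most one "Spark session" is created per Hive session id.
final class SingleSessionSketch {
    static final AtomicInteger CREATED = new AtomicInteger(); // number of sessions started
    private static final ConcurrentHashMap<String, Object> SESSIONS = new ConcurrentHashMap<>();

    // ConcurrentHashMap.computeIfAbsent applies the function at most once per key,
    // even when several threads ask for the same key at the same time.
    static Object getSparkSession(String hiveSessionId) {
        return SESSIONS.computeIfAbsent(hiveSessionId, id -> {
            CREATED.incrementAndGet();
            return new Object(); // stand-in for an expensive Spark session
        });
    }

    public static void main(String[] args) {
        // Several concurrent "queries" in the same Hive session.
        IntStream.range(0, 8).parallel().forEach(i -> getSparkSession("session-1"));
        System.out.println("sessions created: " + CREATED.get());
    }
}
```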





[jira] [Created] (HIVE-20907) TestGetPartitionsUsingProjectionAndFilterSpecs is flaky

2018-11-12 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-20907:
--

 Summary: TestGetPartitionsUsingProjectionAndFilterSpecs is flaky
 Key: HIVE-20907
 URL: https://issues.apache.org/jira/browse/HIVE-20907
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits


The method

private void verifyLocations(List origPartitions, StorageDescriptor sharedSD,
  List partitionWithoutSDS)

expects the two lists to be in the same order.
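
A sketch of a comparison that does not depend on list order: both lists are
copied and sorted before being compared. It works on plain location strings
for brevity; the class and method names are hypothetical and this is not the
actual test code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of an order-insensitive comparison.
final class OrderInsensitiveCompareSketch {
    // Compares two lists of locations without assuming they arrive in the same order.
    static boolean sameLocations(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("/tbl/p=2", "/tbl/p=1");
        List<String> actual = Arrays.asList("/tbl/p=1", "/tbl/p=2");
        System.out.println(sameLocations(expected, actual));
    }
}
```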





[jira] [Created] (HIVE-20904) Yetus fails to resolve module dependencies due to usage of exec plugin in metastore-server

2018-11-12 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-20904:
--

 Summary: Yetus fails to resolve module dependencies due to usage 
of exec plugin in metastore-server
 Key: HIVE-20904
 URL: https://issues.apache.org/jira/browse/HIVE-20904
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


metastore-server uses exec-maven-plugin to generate metastore-site.xml.template 
with ConfTemplatePrinter, which expects some arguments.
Yetus also uses the exec-maven-plugin to determine the order of the modules to 
be built, but passes no parameters, so the execution fails.
https://github.com/apache/yetus/blob/6ebaa1119e611db14f219e289e33ab8ac5c254a7/precommit/src/main/shell/test-patch.d/maven.sh#L658

Steps to reproduce the issue:
mvn -q exec:exec -Dexec.executable=pwd -Dexec.args=''





[jira] [Created] (HIVE-20742) SparkSessionManagerImpl maintenance thread only cleans up session once

2018-10-13 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-20742:
--

 Summary: SparkSessionManagerImpl maintenance thread only cleans up 
session once
 Key: HIVE-20742
 URL: https://issues.apache.org/jira/browse/HIVE-20742
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


If the client session reconnects, SparkSessionManagerImpl doesn't put it back 
into the created sessions, so it will not time out the second time.





[jira] [Created] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-08-22 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-20440:
--

 Summary: Create better cache eviction policy for SmallTableCache
 Key: HIVE-20440
 URL: https://issues.apache.org/jira/browse/HIVE-20440
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


Enhance the SmallTableCache to use a Guava cache with soft references, so that 
entries are evicted when there is memory pressure.
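
The idea can be sketched with the JDK alone: values are held through
SoftReference, so the garbage collector may clear them under memory pressure
and the next lookup reloads them. This is a simplified stand-in, not the
SmallTableCache code and not Guava itself (where CacheBuilder.softValues()
offers soft-referenced values); the class and method names are hypothetical.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a cache whose values can be reclaimed under memory pressure.
final class SoftValueCacheSketch<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    // Returns the cached value, loading it again if it was never cached or has
    // been cleared by the garbage collector.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = ref == null ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftValueCacheSketch<String, String> cache = new SoftValueCacheSketch<>();
        String first = cache.get("small_table", k -> "loaded:" + k);
        String second = cache.get("small_table", k -> "reloaded:" + k);
        System.out.println(first == second);
    }
}
```

Unlike this sketch, a production cache would also guard against two threads
loading the same key at once.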





[jira] [Created] (HIVE-19486) Discrepancy between the config and the code in Hikari connectionPoolingType

2018-05-10 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-19486:
--

 Summary: Discrepancy between the config and the code in Hikari 
connectionPoolingType
 Key: HIVE-19486
 URL: https://issues.apache.org/jira/browse/HIVE-19486
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


MetaStoreConf contains datanucleus.connectionPoolingType "HikariCP" not 
"Hikari".


