[jira] [Created] (HIVE-26285) Overwrite database metadata on original source in optimised failover.

2022-06-01 Thread Haymant Mangla (Jira)
Haymant Mangla created HIVE-26285:
-

 Summary: Overwrite database metadata on original source in 
optimised failover.
 Key: HIVE-26285
 URL: https://issues.apache.org/jira/browse/HIVE-26285
 Project: Hive
  Issue Type: Bug
Reporter: Haymant Mangla
Assignee: Haymant Mangla






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26284) ClassCastException: java.io.PushbackInputStream cannot be cast to org.apache.hadoop.fs.Seekable when table properties contains 'skip.header.line.count' = '1' and datafiles are in UTF-16 encoding

2022-06-01 Thread Nikhil Gupta (Jira)
Nikhil Gupta created HIVE-26284:
---

 Summary: ClassCastException: java.io.PushbackInputStream cannot be 
cast to org.apache.hadoop.fs.Seekable when table properties contains 
'skip.header.line.count' = '1' and datafiles are in UTF-16 encoding
 Key: HIVE-26284
 URL: https://issues.apache.org/jira/browse/HIVE-26284
 Project: Hive
  Issue Type: Bug
Reporter: Nikhil Gupta


{noformat}
ERROR : Vertex failed, vertexName=Map 4, vertexId=vertex_1648118653114_0507_2_00, diagnostics=[Vertex vertex_1648118653114_0507_2_00 [Map 4] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input:  initializer failed, vertex=vertex_1648118653114_0507_2_00 [Map 4], java.lang.ClassCastException: java.io.PushbackInputStream cannot be cast to org.apache.hadoop.fs.Seekable
  at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:78)
  at org.apache.hadoop.hive.ql.io.SkippingTextInputFormat.getCachedStartIndex(SkippingTextInputFormat.java:120)
  at org.apache.hadoop.hive.ql.io.SkippingTextInputFormat.makeSplitInternal(SkippingTextInputFormat.java:73)
  at org.apache.hadoop.hive.ql.io.SkippingTextInputFormat.makeSplit(SkippingTextInputFormat.java:66)
  at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:379)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:532)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:789)
  at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
{noformat}
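The failing frame is a cast of a wrapped stream to Seekable. The following is a minimal sketch of that failure mode using only the JDK. The Seekable interface, the SeekableBytes class, and the getPos helper below are hypothetical stand-ins written for this illustration and are not the Hadoop or Hive classes named in the trace; the sketch only shows that wrapping a seekable stream in java.io.PushbackInputStream hides the interface from a blind cast.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class SeekableCastSketch {
    /** Hypothetical stand-in for org.apache.hadoop.fs.Seekable (not the Hadoop interface). */
    interface Seekable {
        long getPos();
    }

    /** In-memory stream that exposes its read position; exists only for this sketch. */
    static class SeekableBytes extends ByteArrayInputStream implements Seekable {
        SeekableBytes(byte[] data) {
            super(data);
        }

        @Override
        public long getPos() {
            return pos; // protected field inherited from ByteArrayInputStream
        }
    }

    /** Assumed shape of the failing call: the inner stream is cast without a type check. */
    static long getPos(InputStream in) {
        return ((Seekable) in).getPos();
    }

    /** Returns true when asking for the position fails with ClassCastException. */
    static boolean castFails(InputStream in) {
        try {
            getPos(in);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFE, (byte) 0xFF, 0, 'a'};
        // The bare stream implements the stand-in interface, so the cast works.
        System.out.println(castFails(new SeekableBytes(data)));
        // PushbackInputStream does not implement it, so the same cast throws.
        System.out.println(castFails(new PushbackInputStream(new SeekableBytes(data))));
    }
}
```

Where in Hive or Hadoop the PushbackInputStream wrapper is introduced for UTF-16 files is not stated in the report, so the sketch makes no claim about that.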





[jira] [Created] (HIVE-26283) Need better decision making for creating SortedDynPartitionOptimizer

2022-06-01 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-26283:
---

 Summary: Need better decision making for creating 
SortedDynPartitionOptimizer
 Key: HIVE-26283
 URL: https://issues.apache.org/jira/browse/HIVE-26283
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Steve Carlin


When the hive.optimize.sort.dynamic.partition.threshold parameter is set to 0, the 
optimizer decides whether to create the SortedDynPartitionOptimizer class.

In production, we've seen it make the wrong decision when there is a simple 
INSERT..SELECT into a partitioned table and the data being inserted is skewed 
towards one partition.

In this case, it still creates the SortedDynPartitionOptimizer. This 
forces a reducer step, and all the data gets sent to the same reducer.

To reproduce this, you may also have to turn off "autogather" stats, 
since that also creates a reducer step.

What we ultimately want is just a mapper step so the load is evenly distributed 
across the mappers.
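The effect of the skew can be illustrated with a small stand-alone sketch. It assumes rows are routed to reducers by a hash of the partition value, which is a simplification chosen for this example and not a description of Hive's actual shuffle; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkewedPartitionSketch {
    /** Counts rows per reducer when each row is routed by the hash of its partition value. */
    static int[] rowsPerReducer(List<String> partitionValues, int reducers) {
        int[] counts = new int[reducers];
        for (String value : partitionValues) {
            counts[Math.floorMod(value.hashCode(), reducers)]++;
        }
        return counts;
    }

    /** Returns the largest per-reducer row count. */
    static int max(int[] counts) {
        int best = 0;
        for (int c : counts) {
            best = Math.max(best, c);
        }
        return best;
    }

    public static void main(String[] args) {
        // 990 rows fall into one partition and 10 into another (made-up numbers).
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 990; i++) {
            rows.add("ds=2022-06-01");
        }
        for (int i = 0; i < 10; i++) {
            rows.add("ds=2022-05-31");
        }
        int[] counts = rowsPerReducer(rows, 4);
        // Every row with the same partition value lands on the same reducer,
        // so one reducer receives at least 990 of the 1000 rows.
        System.out.println(Arrays.toString(counts) + " max=" + max(counts));
    }
}
```

Under this assumption, adding reducers does not help, because all rows of the dominant partition share one hash value. A mapper-only plan avoids the problem since rows stay with whichever mapper read them.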





[jira] [Created] (HIVE-26282) Improve iceberg CTAS error message for unsupported types

2022-06-01 Thread László Pintér (Jira)
László Pintér created HIVE-26282:


 Summary: Improve iceberg CTAS error message for unsupported types
 Key: HIVE-26282
 URL: https://issues.apache.org/jira/browse/HIVE-26282
 Project: Hive
  Issue Type: Improvement
Reporter: László Pintér
Assignee: László Pintér


When running a CTAS query using a Hive table that has a tinyint, smallint, 
varchar or char column, it fails with an "Unsupported Hive type" error message. 
This can be worked around by setting the 'iceberg.mr.schema.auto.conversion' 
property to true at the session level. We should communicate this possibility 
when raising the exception.
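As a rough illustration of the proposed improvement, the sketch below builds an error message that names the workaround property. The class name, method name, and exact wording are hypothetical and are not taken from the Hive or Iceberg code base; only the property name and the "Unsupported Hive type" text come from the description above.

```java
public class UnsupportedTypeMessageSketch {
    /** Property named in the issue description. */
    static final String AUTO_CONVERSION = "iceberg.mr.schema.auto.conversion";

    /** Builds a message that states the failure and points at the possible workaround. */
    static String message(String hiveType) {
        return String.format(
            "Unsupported Hive type (%s). Setting '%s' to true at the session level "
                + "may allow the column to be converted automatically.",
            hiveType, AUTO_CONVERSION);
    }

    public static void main(String[] args) {
        // The four column types listed in the issue description.
        for (String type : new String[] {"tinyint", "smallint", "varchar", "char"}) {
            System.out.println(message(type));
        }
    }
}
```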





[jira] [Created] (HIVE-26281) Missing statistics when requesting partition by names via HS2

2022-06-01 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-26281:
--

 Summary: Missing statistics when requesting partition by names via 
HS2
 Key: HIVE-26281
 URL: https://issues.apache.org/jira/browse/HIVE-26281
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


[Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
 method can be used to obtain partition objects from the metastore by 
specifying their names and other options.

{code:java}
public List<Partition> getPartitionsByNames(Table tbl, List<String> partNames,
    boolean getColStats)
{code}

However, the partition statistics are missing from the returned objects no 
matter the value of the {{getColStats}} parameter.

The problem is 
[here|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4174]
 and was caused by HIVE-24743.






[jira] [Created] (HIVE-26280) Copy more data into COMPLETED_COMPACTIONS for better supportability

2022-06-01 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-26280:


 Summary: Copy more data into COMPLETED_COMPACTIONS for better 
supportability
 Key: HIVE-26280
 URL: https://issues.apache.org/jira/browse/HIVE-26280
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: Karen Coppage
Assignee: Karen Coppage


There is some information in COMPACTION_QUEUE that doesn't get copied over to 
COMPLETED_COMPACTIONS when compaction completes. It would help with 
supportability if COMPLETED_COMPACTIONS (and especially the view of it in the 
SYS database) also contained this information.





[jira] [Created] (HIVE-26279) Drop unused requests from TestHiveMetaStoreClientApiArgumentsChecker

2022-06-01 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-26279:
--

 Summary: Drop unused requests from 
TestHiveMetaStoreClientApiArgumentsChecker
 Key: HIVE-26279
 URL: https://issues.apache.org/jira/browse/HIVE-26279
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


Some tests in TestHiveMetaStoreClientApiArgumentsChecker create a request 
but never use it, so it is dead code that can be removed.





[jira] [Created] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-01 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-26278:
--

 Summary: Add unit tests for Hive#getPartitionsByNames using 
batching
 Key: HIVE-26278
 URL: https://issues.apache.org/jira/browse/HIVE-26278
 Project: Hive
  Issue Type: Task
  Components: HiveServer2
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


[Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
 supports decomposing requests in batches but there are no unit tests checking 
for the ValidWriteIdList when batching is used.
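For readers unfamiliar with the batching idea, here is a minimal stand-alone sketch of splitting a list of partition names into fixed-size batches. The class and method are hypothetical and do not reproduce the logic in Hive.java; the batch size and partition names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionNameBatcher {
    /** Splits names into consecutive batches of at most batchSize elements. */
    static List<List<String>> batches(List<String> names, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < names.size(); start += batchSize) {
            int end = Math.min(start + batchSize, names.size());
            // Copy the view so each batch is independent of the input list.
            result.add(new ArrayList<>(names.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("p=1", "p=2", "p=3", "p=4", "p=5");
        // Five names with a batch size of two give batches of 2, 2 and 1.
        for (List<String> batch : batches(names, 2)) {
            System.out.println(batch);
        }
    }
}
```

A unit test of the kind requested here would presumably issue one metastore request per batch and check the ValidWriteIdList on each of them, but how that is wired up is not described in the issue.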





[jira] [Created] (HIVE-26277) Add unit tests for ColumnStatsAggregator classes

2022-06-01 Thread Alessandro Solimando (Jira)
Alessandro Solimando created HIVE-26277:
---

 Summary: Add unit tests for ColumnStatsAggregator classes
 Key: HIVE-26277
 URL: https://issues.apache.org/jira/browse/HIVE-26277
 Project: Hive
  Issue Type: Test
  Components: Statistics, Tests
Affects Versions: 4.0.0-alpha-2
Reporter: Alessandro Solimando
Assignee: Alessandro Solimando


We have no unit tests covering these classes, which contain some 
complicated logic, making the absence of tests even riskier.


