[jira] [Work logged] (HIVE-25975) Optimize ClusteredWriter for bucketed Iceberg tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25975?focusedWorklogId=735789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735789
 ]

ASF GitHub Bot logged work on HIVE-25975:
-

Author: ASF GitHub Bot
Created on: 03/Mar/22 06:30
Start Date: 03/Mar/22 06:30
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #3060:
URL: https://github.com/apache/hive/pull/3060#discussion_r818351392



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergOutputCommitter.java
##
@@ -77,7 +77,7 @@
   );
 
   private static final PartitionSpec PARTITIONED_SPEC =
-  PartitionSpec.builderFor(CUSTOMER_SCHEMA).bucket("customer_id", 3).build();
+  PartitionSpec.builderFor(CUSTOMER_SCHEMA).identity("customer_id").build();

Review comment:
   I remember that you have already told me why this change is needed. 
Could you please refresh my memory? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735789)
Time Spent: 5h 40m  (was: 5.5h)

> Optimize ClusteredWriter for bucketed Iceberg tables
> 
>
> Key: HIVE-25975
> URL: https://issues.apache.org/jira/browse/HIVE-25975
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> The first version of the ClusteredWriter in Hive-Iceberg is lenient for 
> bucketed tables: the records do not need to be ordered by the bucket values; 
> the writer simply closes its current file and opens a new one for 
> out-of-order records. 
> This is suboptimal in the long term because it creates many small files. Spark 
> uses a UDF to compute the bucket value for each record and can therefore order 
> the records by bucket values, achieving optimal clustering.
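
For reference, a minimal sketch of the per-record bucket computation described above. It is an illustration only (not code from the PR), but it uses the same Iceberg Transforms API that the GenericUDFIcebergBucket UDF added in PR #3060 wraps; the column value and bucket count are made-up examples.

```
import org.apache.iceberg.transforms.Transform;
import org.apache.iceberg.transforms.Transforms;
import org.apache.iceberg.types.Types;

public class BucketValueSketch {
  public static void main(String[] args) {
    // Iceberg's bucket transform for a 3-bucket spec on a string column,
    // mirroring a bucket("customer_id", 3) partition spec.
    Transform<String, Integer> bucket = Transforms.bucket(Types.StringType.get(), 3);
    // Computing this value per record is what allows the records to be ordered
    // by bucket before they reach the ClusteredWriter, instead of closing and
    // reopening files for out-of-order records.
    Integer bucketId = bucket.apply("customer_1"); // arbitrary sample value
    System.out.println("bucket = " + bucketId);
  }
}
```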



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25975) Optimize ClusteredWriter for bucketed Iceberg tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25975?focusedWorklogId=735786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735786
 ]

ASF GitHub Bot logged work on HIVE-25975:
-

Author: ASF GitHub Bot
Created on: 03/Mar/22 06:25
Start Date: 03/Mar/22 06:25
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #3060:
URL: https://github.com/apache/hive/pull/3060#discussion_r818349170



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/GenericUDFIcebergBucket.java
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.nio.ByteBuffer;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter;
+import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableConstantIntObjectInspector;
+import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.DoubleWritable;
+import org.apache.hadoop.io.FloatWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.iceberg.transforms.Transform;
+import org.apache.iceberg.transforms.Transforms;
+import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
+
+/**
+ * GenericUDFIcebergBucket - UDF that wraps around Iceberg's bucket transform function
+ */
+@Description(name = "iceberg_bucket",
+    value = "_FUNC_(value, bucketCount) - " +
+        "Returns the bucket value calculated by Iceberg bucket transform function ",
+    extended = "Example:\n  > SELECT _FUNC_('A bucket full of ice!', 5);\n  4")
+public class GenericUDFIcebergBucket extends GenericUDF {
+  private final IntWritable result = new IntWritable();
+  private int numBuckets = -1;
+  private transient PrimitiveObjectInspector argumentOI;
+  private transient ObjectInspectorConverters.Converter converter;
+
+  @FunctionalInterface
+  private interface UDFEvalFunction<T> {
+    void apply(T argument) throws HiveException;
+  }
+
+  private transient UDFEvalFunction<DeferredObject> evaluator;
+
+  @Override
+  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
+    if (arguments.length != 2) {
+      throw new UDFArgumentLengthException(
+          "ICEBERG_BUCKET requires 2 arguments (value, bucketCount), but got " + arguments.length);
+    }
+
+    if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
+      throw new UDFArgumentException(
+          "ICEBERG_BUCKET first argument takes primitive types, got " + argumentOI.getTypeName());
+    }
+    argumentOI = (PrimitiveObjectInspector) arguments[0];
+
+    PrimitiveObjectInspector.PrimitiveCategory inputType = argumentOI.getPrimitiveCategory();
+    ObjectInspector outputOI = null;
+    switch (inputType) {
+      case CHAR:
+      case VARCHAR:
+      case STRING:
+        converter = new PrimitiveObjectInspectorConverter.StringConverter(argumentOI);
+        evaluator = arg -> {
+          String val = (String) converter.convert(arg.get());
+          applyBucketTransform(val, Types.StringType.get());
+        };
+        break;
+
+      case BINARY:
+        converter = new PrimitiveObjectInspectorConverter.BinaryConverter(argumentOI,
+

[jira] [Commented] (HIVE-13384) Failed to create HiveMetaStoreClient object with proxy user when Kerberos enabled

2022-03-02 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-13384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500443#comment-17500443
 ] 

Bo Cui commented on HIVE-13384:
---

I think it's better to solve the issue in HiveMetaStoreClient. WDYT? :)

> Failed to create HiveMetaStoreClient object with proxy user when Kerberos 
> enabled
> -
>
> Key: HIVE-13384
> URL: https://issues.apache.org/jira/browse/HIVE-13384
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I wrote a Java client to talk to HiveMetaStore (Hive 1.2.0),
> but found that it can't create a HiveMetaStoreClient object successfully via a 
> proxy user in a Kerberos environment.
> ===
> 15/10/13 00:14:38 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> ==
> While debugging Hive, I found that the error comes from the open() method in 
> the HiveMetaStoreClient class.
> Around line 406,
>  transport = UserGroupInformation.getCurrentUser().doAs(new 
> PrivilegedExceptionAction() {  //FAILED, because the current user 
> doesn't have the credential
> But it works if I change the above line to
>  transport = UserGroupInformation.getCurrentUser().getRealUser().doAs(new 
> PrivilegedExceptionAction() {  //PASS
> I found that DRILL-3413 fixes this error on the Drill side as a workaround. But if I 
> submit a MapReduce job via Pig/HCatalog, it runs into the same issue again 
> when initializing the object via HCatalog.
> It would be better to fix this issue on the Hive side.
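
For reference, a minimal sketch of the proxy-user setup under which this failure shows up. It is an illustration only, using the standard Hadoop UGI API; the principal and keytab values are placeholders, and the comment marks the step that fails for the reason the reporter describes (only the real UGI holds the Kerberos TGT, while HiveMetaStoreClient.open() opens the Thrift connection under the current user).

```
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserMetastoreRepro {
  public static HiveMetaStoreClient connectAs(String endUser) throws Exception {
    HiveConf conf = new HiveConf();
    // Log in as the real (kerberized) service user; principal/keytab are placeholders.
    UserGroupInformation realUgi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
        "hive/host.example.com@EXAMPLE.COM", "/etc/security/keytabs/hive.keytab");
    // Impersonate the end user; only realUgi carries the Kerberos TGT.
    UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(endUser, realUgi);
    // Fails with "GSS initiate failed": inside doAs() the current user is the proxy UGI,
    // which has no credentials, and that is the UGI HiveMetaStoreClient.open() uses.
    return proxyUgi.doAs(
        (PrivilegedExceptionAction<HiveMetaStoreClient>) () -> new HiveMetaStoreClient(conf));
  }
}
```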



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26000) DirectSQL to prune partitions fails with postgres backend for Skewed-Partition tables

2022-03-02 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-26000:
--
Summary: DirectSQL to prune partitions fails with postgres backend for 
Skewed-Partition tables  (was: DirectSQL to pruning partitions fails with 
postgres backend for Skewed-Partition tables)

> DirectSQL to prune partitions fails with postgres backend for 
> Skewed-Partition tables
> -
>
> Key: HIVE-26000
> URL: https://issues.apache.org/jira/browse/HIVE-26000
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
>  
> {code:java}
> 2022-03-02 20:37:56,421 INFO  
> org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-200]: 
> Unable to make the expression tree from expression string [((ds = 
> '2008-04-08') and (UDFToDouble(hr) = 11.0D))]Error parsing partition filter; 
> lexer error: null; exception NoViableAltException(24@[])
> 2022-03-02 20:37:56,593 WARN  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-6-thread-200]: Falling back to ORM path due to direct SQL failure (this 
> is not an error): Error executing SQL query "select 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID", 
> "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, 
> "SKEWED_COL_VALUE_LOC_MAP"."LOCATION", 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_VALUE" from 
> "SKEWED_COL_VALUE_LOC_MAP"  left outer join "SKEWED_STRING_LIST_VALUES" on 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" = 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" where 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" in (51010)  and 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" is not null order by 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."INTEGER_IDX" asc". at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at 
> org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:131)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.setSkewedColLocationMaps(MetastoreDirectSqlUtils.java:414)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:967)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:788)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:117)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:530)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:521)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getSqlResult(ObjectStore.java:3722);
>  Caused by: ERROR: column SKEWED_STRING_LIST_VALUES.string_list_id does not 
> exist
> {code}
>  
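
A hedged reading of the log above (an interpretation, not the actual HIVE-26000 patch): every identifier in the failing query is double-quoted except "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, and PostgreSQL folds unquoted identifiers to lower case, so it looks for string_list_id and reports that the column does not exist. A minimal illustration of the two forms:

```
// Interpretation only, not the HIVE-26000 patch: PostgreSQL folds the unquoted
// identifier to lower case, while the quoted form keeps its case and matches
// the upper-case metastore schema.
public class SkewedStringListIdQuoting {
  // Column reference as it appears in the logged query (fails on a PostgreSQL backend):
  static final String UNQUOTED = "\"SKEWED_STRING_LIST_VALUES\".STRING_LIST_ID";
  // Quoted consistently with the rest of the query:
  static final String QUOTED = "\"SKEWED_STRING_LIST_VALUES\".\"STRING_LIST_ID\"";
}
```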



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26000) DirectSQL to pruning partitions fails with postgres backend for Skewed-Partition tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26000?focusedWorklogId=735609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735609
 ]

ASF GitHub Bot logged work on HIVE-26000:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 21:38
Start Date: 02/Mar/22 21:38
Worklog Time Spent: 10m 
  Work Description: nareshpr opened a new pull request #3073:
URL: https://github.com/apache/hive/pull/3073


   ### What changes were proposed in this pull request?
   PartitionPruning via directSql is failing in postgres db for skewed tables
   
   ### Why are the changes needed?
   Falling back to ORM takes a long time
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   We already have a test case covering the issue (list_bucket_dml_4.q); it 
happens with a postgres backend db.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735609)
Remaining Estimate: 0h
Time Spent: 10m

> DirectSQL to pruning partitions fails with postgres backend for 
> Skewed-Partition tables
> ---
>
> Key: HIVE-26000
> URL: https://issues.apache.org/jira/browse/HIVE-26000
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
>  
> {code:java}
> 2022-03-02 20:37:56,421 INFO  
> org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-200]: 
> Unable to make the expression tree from expression string [((ds = 
> '2008-04-08') and (UDFToDouble(hr) = 11.0D))]Error parsing partition filter; 
> lexer error: null; exception NoViableAltException(24@[])
> 2022-03-02 20:37:56,593 WARN  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-6-thread-200]: Falling back to ORM path due to direct SQL failure (this 
> is not an error): Error executing SQL query "select 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID", 
> "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, 
> "SKEWED_COL_VALUE_LOC_MAP"."LOCATION", 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_VALUE" from 
> "SKEWED_COL_VALUE_LOC_MAP"  left outer join "SKEWED_STRING_LIST_VALUES" on 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" = 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" where 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" in (51010)  and 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" is not null order by 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."INTEGER_IDX" asc". at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at 
> org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:131)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.setSkewedColLocationMaps(MetastoreDirectSqlUtils.java:414)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:967)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:788)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:117)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:530)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:521)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getSqlResult(ObjectStore.java:3722);
>  Caused by: ERROR: column SKEWED_STRING_LIST_VALUES.string_list_id does not 
> exist
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26000) DirectSQL to pruning partitions fails with postgres backend for Skewed-Partition tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26000:
--
Labels: pull-request-available  (was: )

> DirectSQL to pruning partitions fails with postgres backend for 
> Skewed-Partition tables
> ---
>
> Key: HIVE-26000
> URL: https://issues.apache.org/jira/browse/HIVE-26000
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
>  
> {code:java}
> 2022-03-02 20:37:56,421 INFO  
> org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-200]: 
> Unable to make the expression tree from expression string [((ds = 
> '2008-04-08') and (UDFToDouble(hr) = 11.0D))]Error parsing partition filter; 
> lexer error: null; exception NoViableAltException(24@[])
> 2022-03-02 20:37:56,593 WARN  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-6-thread-200]: Falling back to ORM path due to direct SQL failure (this 
> is not an error): Error executing SQL query "select 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID", 
> "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, 
> "SKEWED_COL_VALUE_LOC_MAP"."LOCATION", 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_VALUE" from 
> "SKEWED_COL_VALUE_LOC_MAP"  left outer join "SKEWED_STRING_LIST_VALUES" on 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" = 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" where 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" in (51010)  and 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" is not null order by 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."INTEGER_IDX" asc". at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at 
> org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:131)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.setSkewedColLocationMaps(MetastoreDirectSqlUtils.java:414)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:967)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:788)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:117)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:530)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:521)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getSqlResult(ObjectStore.java:3722);
>  Caused by: ERROR: column SKEWED_STRING_LIST_VALUES.string_list_id does not 
> exist
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26000) DirectSQL to pruning partitions fails with postgres backend for Skewed-Partition tables

2022-03-02 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-26000:
--
Summary: DirectSQL to pruning partitions fails with postgres backend for 
Skewed-Partition tables  (was: Partition table with Skew columns, DirectSQL to 
pruning partitions fails with Postgres backend)

> DirectSQL to pruning partitions fails with postgres backend for 
> Skewed-Partition tables
> ---
>
> Key: HIVE-26000
> URL: https://issues.apache.org/jira/browse/HIVE-26000
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>
>  
>  
> {code:java}
> 2022-03-02 20:37:56,421 INFO  
> org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-200]: 
> Unable to make the expression tree from expression string [((ds = 
> '2008-04-08') and (UDFToDouble(hr) = 11.0D))]Error parsing partition filter; 
> lexer error: null; exception NoViableAltException(24@[])
> 2022-03-02 20:37:56,593 WARN  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-6-thread-200]: Falling back to ORM path due to direct SQL failure (this 
> is not an error): Error executing SQL query "select 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID", 
> "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, 
> "SKEWED_COL_VALUE_LOC_MAP"."LOCATION", 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_VALUE" from 
> "SKEWED_COL_VALUE_LOC_MAP"  left outer join "SKEWED_STRING_LIST_VALUES" on 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" = 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" where 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" in (51010)  and 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" is not null order by 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."INTEGER_IDX" asc". at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at 
> org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:131)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.setSkewedColLocationMaps(MetastoreDirectSqlUtils.java:414)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:967)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:788)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:117)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:530)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:521)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getSqlResult(ObjectStore.java:3722);
>  Caused by: ERROR: column SKEWED_STRING_LIST_VALUES.string_list_id does not 
> exist
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26000) Partition table with Skew columns, DirectSQL to pruning partitions fails with Postgres backend

2022-03-02 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R reassigned HIVE-26000:
-


> Partition table with Skew columns, DirectSQL to pruning partitions fails with 
> Postgres backend
> --
>
> Key: HIVE-26000
> URL: https://issues.apache.org/jira/browse/HIVE-26000
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>
>  
>  
> {code:java}
> 2022-03-02 20:37:56,421 INFO  
> org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-200]: 
> Unable to make the expression tree from expression string [((ds = 
> '2008-04-08') and (UDFToDouble(hr) = 11.0D))]Error parsing partition filter; 
> lexer error: null; exception NoViableAltException(24@[])
> 2022-03-02 20:37:56,593 WARN  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-6-thread-200]: Falling back to ORM path due to direct SQL failure (this 
> is not an error): Error executing SQL query "select 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID", 
> "SKEWED_STRING_LIST_VALUES".STRING_LIST_ID, 
> "SKEWED_COL_VALUE_LOC_MAP"."LOCATION", 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_VALUE" from 
> "SKEWED_COL_VALUE_LOC_MAP"  left outer join "SKEWED_STRING_LIST_VALUES" on 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" = 
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" where 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" in (51010)  and 
> "SKEWED_COL_VALUE_LOC_MAP"."STRING_LIST_ID_KID" is not null order by 
> "SKEWED_COL_VALUE_LOC_MAP"."SD_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."STRING_LIST_ID" asc,  
> "SKEWED_STRING_LIST_VALUES"."INTEGER_IDX" asc". at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at 
> org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:131)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.loopJoinOrderedResult(MetastoreDirectSqlUtils.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.setSkewedColLocationMaps(MetastoreDirectSqlUtils.java:414)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:967)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:788)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:117)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:530)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:521)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getSqlResult(ObjectStore.java:3722);
>  Caused by: ERROR: column SKEWED_STRING_LIST_VALUES.string_list_id does not 
> exist
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25963) Temporary table creation with not null constraint gets converted to external table

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25963?focusedWorklogId=735494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735494
 ]

ASF GitHub Bot logged work on HIVE-25963:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 18:48
Start Date: 02/Mar/22 18:48
Worklog Time Spent: 10m 
  Work Description: sourabh912 commented on pull request #3040:
URL: https://github.com/apache/hive/pull/3040#issuecomment-1057265718


   @yongzhi @harishjp @saihemanth-cloudera @hsnusonic  : Would appreciate 
feedback from you as well ! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735494)
Time Spent: 1h  (was: 50m)

> Temporary table creation with not null constraint gets converted to external 
> table 
> ---
>
> Key: HIVE-25963
> URL: https://issues.apache.org/jira/browse/HIVE-25963
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When creating a temporary table with a not null constraint, it gets converted to 
> an external table. For example: 
> create temporary table t2 (a int not null);
> Table t2's metadata looks like: 
> {code:java}
> +---+++
> |   col_name| data_type   
>|  comment   |
> +---+++
> | a | int 
>||
> |   | NULL
>| NULL   |
> | # Detailed Table Information  | NULL
>| NULL   |
> | Database: | default 
>| NULL   |
> | OwnerType:| USER
>| NULL   |
> | Owner:| sourabh 
>| NULL   |
> | CreateTime:   | Tue Feb 15 15:20:13 PST 2022
>| NULL   |
> | LastAccessTime:   | UNKNOWN 
>| NULL   |
> | Retention:| 0   
>| NULL   |
> | Location: | 
> hdfs://localhost:9000/tmp/hive/sourabh/80d374a8-cd7a-4fcf-ae72-51b04ff9c3d8/_tmp_space.db/4574446d-c144-48f9-b4b6-2e9ee0ce5be4
>  | NULL   |
> | Table Type:   | EXTERNAL_TABLE  
>| NULL   |
> | Table Parameters: | NULL
>| NULL   |
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\"}} |
> |   | EXTERNAL
>| TRUE   |
> |   | TRANSLATED_TO_EXTERNAL  
>| TRUE   |
> |   | bucketing_version   
>| 2  |
> |   | external.table.purge
>| TRUE   |
> |   | numFiles
>| 0

[jira] [Updated] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25935:
--
Labels: pull-request-available  (was: )

> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is missing where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25935?focusedWorklogId=735418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735418
 ]

ASF GitHub Bot logged work on HIVE-25935:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 17:13
Start Date: 02/Mar/22 17:13
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3072:
URL: https://github.com/apache/hive/pull/3072


   ### What changes were proposed in this pull request?
   Cleans up deprecated methods
   
   ### Why are the changes needed?
   We do not want to release deprecated methods
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Unit tests will show errors, if any


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735418)
Remaining Estimate: 0h
Time Spent: 10m

> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is missing where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25977) Enhance Compaction Cleaner to skip when there is nothing to do #2

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25977?focusedWorklogId=735348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735348
 ]

ASF GitHub Bot logged work on HIVE-25977:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 15:13
Start Date: 02/Mar/22 15:13
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2971:
URL: https://github.com/apache/hive/pull/2971#discussion_r817789462



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -439,29 +440,40 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
     return success;
   }
 
-  private boolean hasDataBelowWatermark(FileSystem fs, Path path, long highWatermark) throws IOException {
-    FileStatus[] children = fs.listStatus(path);
+  private boolean hasDataBelowWatermark(AcidDirectory acidDir, FileSystem fs, Path path, long highWatermark,
+      long minOpenTxn)
+      throws IOException {
+    Set<Path> acidPaths = new HashSet<>();
+    for (ParsedDelta delta : acidDir.getCurrentDirectories()) {
+      acidPaths.add(delta.getPath());
+    }
+    if (acidDir.getBaseDirectory() != null) {
+      acidPaths.add(acidDir.getBaseDirectory());
+    }
+    FileStatus[] children = fs.listStatus(path, p -> {
+      return !acidPaths.contains(p);
+    });
     for (FileStatus child : children) {
-      if (isFileBelowWatermark(child, highWatermark)) {
+      if (isFileBelowWatermark(child, highWatermark, minOpenTxn)) {
         return true;
       }
     }
     return false;
   }
 
-  private boolean isFileBelowWatermark(FileStatus child, long highWatermark) {
+  private boolean isFileBelowWatermark(FileStatus child, long highWatermark, long minOpenTxn) {
     Path p = child.getPath();
     String fn = p.getName();
     if (!child.isDirectory()) {
-      return false;
+      return true;
     }
     if (fn.startsWith(AcidUtils.BASE_PREFIX)) {
       ParsedBaseLight b = ParsedBaseLight.parseBase(p);
-      return b.getWriteId() < highWatermark;
+      return b.getWriteId() <= highWatermark && b.getVisibilityTxnId() <= minOpenTxn;

Review comment:
   I can't remember if we discussed this. What if the table has files:
   ```
   base_2_v5
   delta_3_3
   delta_4_4
   base_4_v16 (this compaction)
   ``` 
   
   
   hwm=4
   
   Say the worker is older and did not populate CQ_NEXT_TXN_ID, so the cleaner 
will pick up any compaction in "ready for cleaning"
   Say minTxnId=13
   The AcidDirectory will contain:
   ```
   base_2_v5
   delta_3_3
   delta_4_4
   ```
   
   So isFileBelowWatermark will investigate:
   ```
   base_4_v16
   ```
   And return false because 4=b.getWriteId() <= highWatermark=4 is true and 
16=b.getVisibilityTxnId() <= minOpenTxn=13 is false.
   
   
   Also I probably have amnesia but can you remind me of the rationale behind 
`b.getVisibilityTxnId() <= minOpenTxn` here?
   Also keep in mind that the visibilityTxnId is always 0 unless the delta/base 
is a product of compaction.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735348)
Time Spent: 40m  (was: 0.5h)

> Enhance Compaction Cleaner to skip when there is nothing to do #2
> -
>
> Key: HIVE-25977
> URL: https://issues.apache.org/jira/browse/HIVE-25977
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> initially this was just an addendum to the original patch ; but got delayed 
> and altered - so it should have its own ticket



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25977) Enhance Compaction Cleaner to skip when there is nothing to do #2

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25977?focusedWorklogId=735347&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735347
 ]

ASF GitHub Bot logged work on HIVE-25977:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 15:12
Start Date: 02/Mar/22 15:12
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2971:
URL: https://github.com/apache/hive/pull/2971#discussion_r817789462



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -439,29 +440,40 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
     return success;
   }
 
-  private boolean hasDataBelowWatermark(FileSystem fs, Path path, long highWatermark) throws IOException {
-    FileStatus[] children = fs.listStatus(path);
+  private boolean hasDataBelowWatermark(AcidDirectory acidDir, FileSystem fs, Path path, long highWatermark,
+      long minOpenTxn)
+      throws IOException {
+    Set<Path> acidPaths = new HashSet<>();
+    for (ParsedDelta delta : acidDir.getCurrentDirectories()) {
+      acidPaths.add(delta.getPath());
+    }
+    if (acidDir.getBaseDirectory() != null) {
+      acidPaths.add(acidDir.getBaseDirectory());
+    }
+    FileStatus[] children = fs.listStatus(path, p -> {
+      return !acidPaths.contains(p);
+    });
     for (FileStatus child : children) {
-      if (isFileBelowWatermark(child, highWatermark)) {
+      if (isFileBelowWatermark(child, highWatermark, minOpenTxn)) {
         return true;
       }
     }
     return false;
   }
 
-  private boolean isFileBelowWatermark(FileStatus child, long highWatermark) {
+  private boolean isFileBelowWatermark(FileStatus child, long highWatermark, long minOpenTxn) {
     Path p = child.getPath();
     String fn = p.getName();
     if (!child.isDirectory()) {
-      return false;
+      return true;
     }
     if (fn.startsWith(AcidUtils.BASE_PREFIX)) {
       ParsedBaseLight b = ParsedBaseLight.parseBase(p);
-      return b.getWriteId() < highWatermark;
+      return b.getWriteId() <= highWatermark && b.getVisibilityTxnId() <= minOpenTxn;

Review comment:
   I can't remember if we discussed this. What if the table has files:
   ```
   base_2_v5
   delta_3_3
   delta_4_4
   base_4_v16
   ``` (this compaction)
   
   hwm=4
   
   Say the worker is older and did not populate CQ_NEXT_TXN_ID, so the cleaner 
will pick up any compaction in "ready for cleaning"
   Say minTxnId=13
   The AcidDirectory will contain:
   base_2_v5
   delta_3_3
   delta_4_4
   
   So isFileBelowWatermark will investigate:
   base_4_v16
   And return false because 4=b.getWriteId() <= highWatermark=4 is true and 
16=b.getVisibilityTxnId() <= minOpenTxn=13 is false.
   
   
   Also I probably have amnesia but can you remind me of the rationale behind 
`b.getVisibilityTxnId() <= minOpenTxn` here?
   Also keep in mind that the visibilityTxnId is always 0 unless the delta/base 
is a product of compaction.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735347)
Time Spent: 0.5h  (was: 20m)

> Enhance Compaction Cleaner to skip when there is nothing to do #2
> -
>
> Key: HIVE-25977
> URL: https://issues.apache.org/jira/browse/HIVE-25977
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> initially this was just an addendum to the original patch ; but got delayed 
> and altered - so it should have its own ticket



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25994) Analyze table runs into ClassNotFoundException-s in case binary distribution is used

2022-03-02 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-25994:
---

Assignee: Zoltan Haindrich

> Analyze table runs into ClassNotFoundException-s in case binary distribution 
> is used
> 
>
> Key: HIVE-25994
> URL: https://issues.apache.org/jira/browse/HIVE-25994
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> any nightly release can be used to reproduce this:
> {code}
> create table t (a integer); insert into t values (1) ; analyze table t 
> compute statistics for columns;
> {code}
> results in
> {code}
> Caused by: java.lang.NoClassDefFoundError: org/antlr/runtime/tree/CommonTree
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
> at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
> at java.lang.Class.getDeclaredConstructors0(Native Method)
> at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
> at java.lang.Class.getConstructor0(Class.java:3075)
> at java.lang.Class.getDeclaredConstructor(Class.java:2178)
> at 
> org.apache.hive.com.esotericsoftware.reflectasm.ConstructorAccess.get(ConstructorAccess.java:65)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultInstantiatorStrategy.newInstantiatorOf(DefaultInstantiatorStrategy.java:60)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstantiator(Kryo.java:1119)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1128)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:153)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:118)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:729)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ReflectField.read(ReflectField.java:125)
> ... 38 more
> Caused by: java.lang.ClassNotFoundException: org.antlr.runtime.tree.CommonTree
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25986) Statement id is incorrect in case of load in path to MM table

2022-03-02 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25986:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for the review [~pvary]

> Statement id is incorrect in case of load in path to MM table
> -
>
> Key: HIVE-25986
> URL: https://issues.apache.org/jira/browse/HIVE-25986
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25986) Statement id is incorrect in case of load in path to MM table

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25986?focusedWorklogId=735334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735334
 ]

ASF GitHub Bot logged work on HIVE-25986:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 14:48
Start Date: 02/Mar/22 14:48
Worklog Time Spent: 10m 
  Work Description: asinkovits merged pull request #3055:
URL: https://github.com/apache/hive/pull/3055


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735334)
Time Spent: 40m  (was: 0.5h)

> Statement id is incorrect in case of load in path to MM table
> -
>
> Key: HIVE-25986
> URL: https://issues.apache.org/jira/browse/HIVE-25986
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: ACID, pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread Zoltán Borók-Nagy (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500210#comment-17500210
 ] 

Zoltán Borók-Nagy commented on HIVE-25935:
--

Hi [~pvary],

Thanks for reaching out. In Impala we are using the following from 
IMetastoreClient:
* {{getPartitionsByNames(String, String, List)}}
* {{getPartitionsByNames(GetPartitionsByNamesRequest)}}

And btw we also use the thrift interface directly:
* {{get_partitions_by_names}}
* {{get_partitions_by_names_req}}


> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is missing where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25495) Upgrade to JLine3

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=735322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735322
 ]

ASF GitHub Bot logged work on HIVE-25495:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 14:17
Start Date: 02/Mar/22 14:17
Worklog Time Spent: 10m 
  Work Description: LA-Toth edited a comment on pull request #3069:
URL: https://github.com/apache/hive/pull/3069#issuecomment-1056978303


   I created it partially based on [pull request 2617](https://github.com/apache/hive/pull/2617).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735322)
Time Spent: 2.5h  (was: 2h 20m)

> Upgrade to JLine3
> -
>
> Key: HIVE-25495
> URL: https://issues.apache.org/jira/browse/HIVE-25495
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago. Hadoop uses JLine 3, so Hive 
> should match.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25495) Upgrade to JLine3

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=735321&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735321
 ]

ASF GitHub Bot logged work on HIVE-25495:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 14:16
Start Date: 02/Mar/22 14:16
Worklog Time Spent: 10m 
  Work Description: LA-Toth commented on pull request #3069:
URL: https://github.com/apache/hive/pull/3069#issuecomment-1056978303


   I created it partially based on pull request 2617.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735321)
Time Spent: 2h 20m  (was: 2h 10m)

> Upgrade to JLine3
> -
>
> Key: HIVE-25495
> URL: https://issues.apache.org/jira/browse/HIVE-25495
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago. Hadoop uses JLine 3, so Hive 
> should match.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25495) Upgrade to JLine3

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=735320&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735320
 ]

ASF GitHub Bot logged work on HIVE-25495:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 14:15
Start Date: 02/Mar/22 14:15
Worklog Time Spent: 10m 
  Work Description: LA-Toth opened a new pull request #3069:
URL: https://github.com/apache/hive/pull/3069


   ### What changes were proposed in this pull request?
   
   - Update jline to 3.21 - the behaviour is quite different, so a modified 
version of JLine2's StringCompleter is added in hive-common.
   - Update sqlline to 1.12 as it uses jline 3.21
   - Add some missing dependencies to pom.xml files (their absence didn't cause 
issues before, possibly due to transitive dependencies)
   
   
   ### Why are the changes needed?
   It's a step required to use JDK 17
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   By existing tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735320)
Time Spent: 2h 10m  (was: 2h)

> Upgrade to JLine3
> -
>
> Key: HIVE-25495
> URL: https://issues.apache.org/jira/browse/HIVE-25495
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago. Hadoop uses JLine 3, so Hive 
> should match.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25975) Optimize ClusteredWriter for bucketed Iceberg tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25975?focusedWorklogId=735276&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735276
 ]

ASF GitHub Bot logged work on HIVE-25975:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 12:59
Start Date: 02/Mar/22 12:59
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #3060:
URL: https://github.com/apache/hive/pull/3060#discussion_r817666327



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/GenericUDFIcebergBucket.java
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.nio.ByteBuffer;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableConstantIntObjectInspector;
+import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.DoubleWritable;
+import org.apache.hadoop.io.FloatWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.iceberg.transforms.Transform;
+import org.apache.iceberg.transforms.Transforms;
+import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
+
+/**
+ * GenericUDFIcebergBucket - UDF that wraps around Iceberg's bucket transform 
function
+ */
+@Description(name = "iceberg_bucket",
+value = "_FUNC_(value, bucketCount) - " +
+"Returns the bucket value calculated by Iceberg bucket transform 
function ",
+extended = "Example:\n  > SELECT _FUNC_('A bucket full of ice!', 5);\n  4")
+//@VectorizedExpressions({StringLength.class})
+public class GenericUDFIcebergBucket extends GenericUDF {
+  private final IntWritable result = new IntWritable();
+  private transient PrimitiveObjectInspector argumentOI;
+  private transient PrimitiveObjectInspectorConverter.StringConverter 
stringConverter;
+  private transient PrimitiveObjectInspectorConverter.BinaryConverter 
binaryConverter;
+  private transient PrimitiveObjectInspectorConverter.IntConverter 
intConverter;
+  private transient PrimitiveObjectInspectorConverter.LongConverter 
longConverter;
+  private transient PrimitiveObjectInspectorConverter.HiveDecimalConverter 
decimalConverter;
+  private transient PrimitiveObjectInspectorConverter.FloatConverter 
floatConverter;
+  private transient PrimitiveObjectInspectorConverter.DoubleConverter 
doubleConverter;
+  private transient Type.PrimitiveType icebergType;
+  private int numBuckets = -1;
+
+  @Override
+  public ObjectInspector initialize(ObjectInspector[] arguments) throws 
UDFArgumentException {
+if (arguments.length != 2) {
+  throw new UDFArgumentLengthException(
+  "ICEBERG_BUCKET requires 2 argument, got " + arguments.length);
+}
+
+if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
+  throw new UDFArgumentException(
+  "ICEBERG_BUCKET first argument takes primitive types, got " + 
argumentOI.getTypeName());
+}
+argumentOI = (PrimitiveObjectInspector) arguments[0];
+
+PrimitiveObjectInspector.PrimitiveCategory inputType = 
argumentOI.getPrimitiveCategory();
+ObjectInspector outputOI = null;
+switch (inputType) {
+  case CHAR:
+ 
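
For reference, a minimal standalone sketch of the Iceberg bucket transform that the 
UDF above wraps (the sample value and bucket count are illustrative only):

```java
import org.apache.iceberg.transforms.Transform;
import org.apache.iceberg.transforms.Transforms;
import org.apache.iceberg.types.Types;

public class BucketTransformExample {
  public static void main(String[] args) {
    // Build the same kind of transform the UDF delegates to: bucket(5) over a string type.
    Transform<CharSequence, Integer> bucket = Transforms.bucket(Types.StringType.get(), 5);
    // Applying it yields the bucket id that clustering/ordering would be based on.
    Integer bucketId = bucket.apply("A bucket full of ice!");
    System.out.println(bucketId);
  }
}
```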

[jira] [Work logged] (HIVE-25975) Optimize ClusteredWriter for bucketed Iceberg tables

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25975?focusedWorklogId=735272=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735272
 ]

ASF GitHub Bot logged work on HIVE-25975:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 12:42
Start Date: 02/Mar/22 12:42
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #3060:
URL: https://github.com/apache/hive/pull/3060#discussion_r817654079



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/GenericUDFIcebergBucket.java
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.nio.ByteBuffer;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableConstantIntObjectInspector;
+import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.DoubleWritable;
+import org.apache.hadoop.io.FloatWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.iceberg.transforms.Transform;
+import org.apache.iceberg.transforms.Transforms;
+import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
+
+/**
+ * GenericUDFIcebergBucket - UDF that wraps around Iceberg's bucket transform 
function
+ */
+@Description(name = "iceberg_bucket",
+value = "_FUNC_(value, bucketCount) - " +
+"Returns the bucket value calculated by Iceberg bucket transform 
function ",
+extended = "Example:\n  > SELECT _FUNC_('A bucket full of ice!', 5);\n  4")
+//@VectorizedExpressions({StringLength.class})
+public class GenericUDFIcebergBucket extends GenericUDF {
+  private final IntWritable result = new IntWritable();
+  private transient PrimitiveObjectInspector argumentOI;
+  private transient PrimitiveObjectInspectorConverter.StringConverter 
stringConverter;
+  private transient PrimitiveObjectInspectorConverter.BinaryConverter 
binaryConverter;
+  private transient PrimitiveObjectInspectorConverter.IntConverter 
intConverter;
+  private transient PrimitiveObjectInspectorConverter.LongConverter 
longConverter;
+  private transient PrimitiveObjectInspectorConverter.HiveDecimalConverter 
decimalConverter;
+  private transient PrimitiveObjectInspectorConverter.FloatConverter 
floatConverter;
+  private transient PrimitiveObjectInspectorConverter.DoubleConverter 
doubleConverter;
+  private transient Type.PrimitiveType icebergType;
+  private int numBuckets = -1;
+
+  @Override
+  public ObjectInspector initialize(ObjectInspector[] arguments) throws 
UDFArgumentException {
+if (arguments.length != 2) {
+  throw new UDFArgumentLengthException(
+  "ICEBERG_BUCKET requires 2 argument, got " + arguments.length);
+}
+
+if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
+  throw new UDFArgumentException(
+  "ICEBERG_BUCKET first argument takes primitive types, got " + 
argumentOI.getTypeName());
+}
+argumentOI = (PrimitiveObjectInspector) arguments[0];
+
+PrimitiveObjectInspector.PrimitiveCategory inputType = 
argumentOI.getPrimitiveCategory();
+ObjectInspector outputOI = null;
+switch (inputType) {
+  case CHAR:
+ 

[jira] [Commented] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500099#comment-17500099
 ] 

Peter Vary commented on HIVE-25935:
---

So the HMS API contains 2 methods:
{code:java}
// get partitions give a list of partition names
list<Partition> get_partitions_by_names(1:string db_name 2:string tbl_name
                                        3:list<string> names)
                     throws(1:MetaException o1, 2:NoSuchObjectException o2)

GetPartitionsByNamesResult get_partitions_by_names_req(1:GetPartitionsByNamesRequest req)
                     throws(1:MetaException o1, 2:NoSuchObjectException o2)
{code}
The first one is the old version (2011), the second one is the new version (2019).

We definitely need both of these Thrift methods.

So the question is mainly about the HiveMetaStoreClient methods.
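
For illustration, a rough sketch of how a caller uses the classic client-side 
variant that is backed by the first Thrift method; the database, table and 
partition names are made up, and the request-based variant takes a 
GetPartitionsByNamesRequest instead:
{code:java}
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class GetPartitionsByNamesExample {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    try {
      // Classic variant: database name, table name and a list of partition names.
      List<Partition> parts = client.getPartitionsByNames(
          "default", "web_logs", Arrays.asList("ds=2022-03-01", "ds=2022-03-02"));
      parts.forEach(p -> System.out.println(p.getValues()));
    } finally {
      client.close();
    }
  }
}
{code}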

> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is necessary where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-25935:
-

Assignee: Peter Vary

> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is necessary where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735244
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 11:46
Start Date: 02/Mar/22 11:46
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #3067:
URL: https://github.com/apache/hive/pull/3067#issuecomment-1056838849


   @kgyrtkirk: You wrote:
   > can we add the unpack and build commands to the jenkinsfile - so that we 
will be checking that the src archive is usable as part of the CI builds?
   
   Shall we do this in this jira?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735244)
Time Spent: 1h  (was: 50m)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25935) Cleanup IMetaStoreClient#getPartitionsByNames APIs

2022-03-02 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500078#comment-17500078
 ] 

Peter Vary commented on HIVE-25935:
---

[~boroknagyz]: Can you please help us identify which getPartitionsByNames methods are 
used by Impala?

Thanks,

Peter

> Cleanup IMetaStoreClient#getPartitionsByNames APIs
> --
>
> Key: HIVE-25935
> URL: https://issues.apache.org/jira/browse/HIVE-25935
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Stamatis Zampetakis
>Priority: Major
> Fix For: 4.0.0-alpha-1
>
>
> Currently the 
> [IMetastoreClient|https://github.com/apache/hive/blob/4b7a948e45fd88372fef573be321cda40d189cc7/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java]
>  interface has 8 variants of the {{getPartitionsByNames}} method. Going 
> quickly over the concrete implementation it appears that not all of them are 
> useful/necessary so a bit of cleanup is needed.
> Below a few potential problems I observed:
> * Some of the APIs are not used anywhere in the project (neither by 
> production nor by test code).
> * Some of the APIs are deprecated in some concrete implementations but not 
> globally at the interface level without an explanation why.
> * Some of the implementations simply throw without doing anything.
> * Many of the APIs are partially tested or not tested at all.
> HIVE-24743, HIVE-25281 are related since they introduce/deprecate some of the 
> aforementioned APIs.
> It would be good to review the aforementioned APIs and decide what needs to 
> stay and what needs to go, as well as complete what is necessary where relevant.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Reopened] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reopened HIVE-25998:
---

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500075#comment-17500075
 ] 

Peter Vary commented on HIVE-25998:
---

Ok. Removed the target version and reopened the ticket.

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25998:
--
Target Version/s:   (was: 4.0.0-alpha-1)

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500058#comment-17500058
 ] 

Stamatis Zampetakis commented on HIVE-25998:


Won't fix is a bit misleading. We could simply keep the ticket open till the 
right time comes to remove the flag.

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735229
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 11:07
Start Date: 02/Mar/22 11:07
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #3067:
URL: https://github.com/apache/hive/pull/3067#discussion_r817585171



##
File path: packaging/src/main/assembly/src.xml
##
@@ -100,12 +100,26 @@
 storage-api/**/*
 standalone-metastore/metastore-common/**/*
 standalone-metastore/metastore-server/**/*
+standalone-metastore/metastore-tools/**/*
+standalone-metastore/src/assembly/src.xml
+standalone-metastore/pom.xml

Review comment:
   I don't know the full story behind these "independent" releases. If we want 
to start releasing everything together from now on, that's fine with me. This 
could even help sanitize the Maven structure a bit, which is a bit messy at 
the moment. I just wanted to bring this up and be sure that what we are doing 
is intentional.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735229)
Time Spent: 50m  (was: 40m)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735217
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 10:48
Start Date: 02/Mar/22 10:48
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #3067:
URL: https://github.com/apache/hive/pull/3067#discussion_r817570663



##
File path: packaging/src/main/assembly/src.xml
##
@@ -100,12 +100,26 @@
 storage-api/**/*
 standalone-metastore/metastore-common/**/*
 standalone-metastore/metastore-server/**/*
+standalone-metastore/metastore-tools/**/*
+standalone-metastore/src/assembly/src.xml
+standalone-metastore/pom.xml

Review comment:
   these modules are not fully separated - meaning they are connected to 
the root pom.xml;
   if we really want to separate them - we can put them into a new project...
   
   but honestly I think these things were released separately because there were 
no regular Hive releases




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735217)
Time Spent: 40m  (was: 0.5h)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735209
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 10:37
Start Date: 02/Mar/22 10:37
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #3067:
URL: https://github.com/apache/hive/pull/3067#discussion_r817557491



##
File path: packaging/src/main/assembly/src.xml
##
@@ -100,12 +100,26 @@
 storage-api/**/*

Review comment:
   This is unrelated to the changes here, but it is kind of weird that we 
are packing into the source packages of Hive a project/module (`storage-api`) 
that has its own release cycle. If we are including the sources here then we 
are somewhat releasing the storage-api multiple times.

##
File path: packaging/src/main/assembly/src.xml
##
@@ -100,12 +100,26 @@
 storage-api/**/*
 standalone-metastore/metastore-common/**/*
 standalone-metastore/metastore-server/**/*
+standalone-metastore/metastore-tools/**/*
+standalone-metastore/src/assembly/src.xml
+standalone-metastore/pom.xml

Review comment:
   Honestly, I am not sure what the proper solution is here. The 
`standalone-metastore` was separated in HIVE-17159 so that it gets its own 
release life-cycle. Since 2017, though, I don't think this has ever happened in 
practice. My concern here is the same as with `storage-api`: we shouldn't be 
releasing the same code twice.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735209)
Time Spent: 0.5h  (was: 20m)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-25998.
---
Resolution: Won't Fix

We won't change this for now, as the {{iceberg-shading}} module is needed, and 
if we enable it then developers using IntelliJ will have to manually change 
the dependency from {{module}} to {{library}}, which would disrupt their 
work.

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735188=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735188
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 09:42
Start Date: 02/Mar/22 09:42
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #3067:
URL: https://github.com/apache/hive/pull/3067#issuecomment-1056697160


   can we add the unpack and build commands to the jenkinsfile - so that we 
will be checking that the src archive is usable as part of the CI builds?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735188)
Time Spent: 20m  (was: 10m)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25999) INSERT OVERWRITE on partitioned table fails when the number of columns in the table is >= 225

2022-03-02 Thread Amrithaa R G (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrithaa R G updated HIVE-25999:

Description: 
 

==
Issue description:
==

The hive query "insert overwrite table ..." fails when the number of columns in 
a partitioned table is 225 or more.

To repro the issue, some data needs to be loaded into the temp table after the 
table creation. After this, the "insert overwrite" command on a partitioned 
target table fails with the error below.

PLEASE NOTE: The error will not be reproducible if no data is loaded into the 
temp table.

Steps to reproduce are given below.

==
Error:
==
ERROR : FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask. Exception when loading 1 in table 
tgt_manycol with 
loadPath=hdfs://:8020/user/hive/warehouse/tgt_manycol/.hive-staging_hive_2022-02-24_12-13-54_035_1332847469986875236-5/-ext-1
INFO : Completed executing 
command(queryId=hive_20220224121354_c5509257-7d9c-462c-9089-282c758e2b2f); Time 
taken: 37.457 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.MoveTask. Exception when loading 1 in table 
tgt_manycol with 
loadPath=hdfs://:8020/user/hive/warehouse/tgt_manycol/.hive-staging_hive_2022-02-24_12-13-54_035_1332847469986875236-5/-ext-1
 (state=08S01,code=1)

===
Repro Steps:
===
Run the following queries in the given order to repro the error: 

1) CREATE TABLE tgt_ManyCol_temp (string1 string,int1 int,decimal1 
decimal(15,8),double1 double,date1 date,string_out_1 string,string2 string,int2 
int,decimal2 decimal(15,8),double2 double,date2 date,string_out_2 
string,string3 string,int3 int,decimal3 decimal(15,8),double3 double,date3 
date,string_out_3 string,string4 string,int4 int,decimal4 decimal(15,8),double4 
double,date4 date,string_out_4 string,string5 string,int5 int,decimal5 
decimal,double5 double,date5 date,string_out_5 string,string6 string,int6 
int,decimal6 decimal(15,8),double6 double,date6 date,string_out_6 
string,string7 string,int7 int,decimal7 decimal(15,8),double7 double,date7 
date,string_out_7 string,string8 string,int8 int,decimal8 decimal(15,8),double8 
double,date8 date,string_out_8 string,string9 string,int9 int,decimal9 
decimal(15,8),double9 double,date9 date,string_out_9 string,string10 
string,int10 int,decimal10 decimal(15,8),double10 double,date10 
date,string_out_10 string,string11 string,int11 int,decimal11 
decimal(15,8),double11 double,date11 date,string_out_11 string,string12 
string,int12 int,decimal12 decimal(15,8),double12 double,date12 
date,string_out_12 string,string13 string,int13 int,decimal13 
decimal(15,8),double13 double,date13 date,string_out_13 string,string14 
string,int14 int,decimal14 decimal(15,8),double14 double,date14 
date,string_out_14 string,string15 string,int15 int,decimal15 
decimal(15,8),double15 double,date15 date,string_out_15 string,string16 
string,int16 int,decimal16 decimal(15,8),double16 double,date16 
date,string_out_16 string,string17 string,int17 int,decimal17 
decimal(15,8),double17 double,date17 date,string_out_17 string,string18 
string,int18 int,decimal18 decimal(15,8),double18 double,date18 
date,string_out_18 string,string19 string,int19 int,decimal19 
decimal(15,8),double19 double,date19 date,string_out_19 string,string20 
string,int20 int,decimal20 decimal(15,8),double20 double,date20 
date,string_out_20 string,string21 string,int21 int,decimal21 
decimal(15,8),double21 double,date21 date,string_out_21 string,string22 
string,int22 int,decimal22 decimal(15,8),double22 double,date22 
date,string_out_22 string,string23 string,int23 int,decimal23 decimal,double23 
double,date23 date,string_out_23 string,string24 string,int24 int,decimal24 
decimal(15,8),double24 double,date24 date,string_out_24 string,string25 
string,int25 int,decimal25 decimal(15,8),double25 double,date25 
date,string_out_25 string,string26 string,int26 int,decimal26 
decimal(15,8),double26 double,date26 date,string_out_26 string,string27 
string,int27 int,decimal27 decimal(15,8),double27 double,date27 
date,string_out_27 string,string28 string,int28 int,decimal28 
decimal(15,8),double28 double,date28 date,string_out_28 string,string29 
string,int29 int,decimal29 decimal(15,8),double29 double,date29 
date,string_out_29 string,string30 string,int30 int,decimal30 
decimal(15,8),double30 double,date30 date,string_out_30 string,string31 
string,int31 int,decimal31 decimal(15,8),double31 double,date31 
date,string_out_31 string,string32 string,int32 int,decimal32 
decimal(15,8),double32 

[jira] [Updated] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25998:
--
Labels: pull-request-available  (was: )

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?focusedWorklogId=735182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735182
 ]

ASF GitHub Bot logged work on HIVE-25998:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 09:00
Start Date: 02/Mar/22 09:00
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3068:
URL: https://github.com/apache/hive/pull/3068


   ### What changes were proposed in this pull request?
   Remove the need for `-Piceberg` flag when compiling Iceberg
   
   ### Why are the changes needed?
   Iceberg has matured a lot. Hopefully it will not create issues from now on.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735182)
Remaining Estimate: 0h
Time Spent: 10m

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25894) Table migration to Iceberg doesn't remove HMS partitions

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25894?focusedWorklogId=735180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735180
 ]

ASF GitHub Bot logged work on HIVE-25894:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 08:47
Start Date: 02/Mar/22 08:47
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #3061:
URL: https://github.com/apache/hive/pull/3061#discussion_r817467883



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -3939,8 +3939,8 @@ protected boolean getPartitionsByExprInternal(String 
catName, String dbName, Str
   boolean allowSql, boolean allowJdo) throws TException {
 assert result != null;
 
-final ExpressionTree exprTree = 
PartFilterExprUtil.makeExpressionTree(expressionProxy, expr,
-
getDefaultPartitionName(defaultPartitionName), conf);
+final ExpressionTree exprTree = expr.length != 0 ? 
PartFilterExprUtil.makeExpressionTree(

Review comment:
   The zero length expression signifies that no filtering is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735180)
Time Spent: 0.5h  (was: 20m)

> Table migration to Iceberg doesn't remove HMS partitions
> 
>
> Key: HIVE-25894
> URL: https://issues.apache.org/jira/browse/HIVE-25894
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Repro:
> {code:java}
> create table ice_part_migrate (i int) partitioned by (p int) stored as 
> parquet;
> insert into ice_part_migrate partition(p=1) values (1), (11), (111);
> insert into ice_part_migrate partition(p=2) values (2), (22), (222);
> ALTER TABLE ice_part_migrate  SET TBLPROPERTIES 
> ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');
> {code}
> Then looking at the HMS database:
> {code:java}
> => select "PART_NAME" from "PARTITIONS" p, "TBLS" t where 
> t."TBL_ID"=p."TBL_ID" and t."TBL_NAME"='ice_part_migrate';
>  PART_NAME
> ---
>  p=1
>  p=2
> {code}
> This is weird because Iceberg tables are supposed to be unpartitioned. It 
> also breaks some precondition checks in Impala. Is there a particular reason 
> to keep the partitions in HMS?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25894) Table migration to Iceberg doesn't remove HMS partitions

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25894?focusedWorklogId=735179=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735179
 ]

ASF GitHub Bot logged work on HIVE-25894:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 08:45
Start Date: 02/Mar/22 08:45
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #3061:
URL: https://github.com/apache/hive/pull/3061#discussion_r817466517



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -3939,8 +3939,8 @@ protected boolean getPartitionsByExprInternal(String 
catName, String dbName, Str
   boolean allowSql, boolean allowJdo) throws TException {
 assert result != null;
 
-final ExpressionTree exprTree = 
PartFilterExprUtil.makeExpressionTree(expressionProxy, expr,
-
getDefaultPartitionName(defaultPartitionName), conf);
+final ExpressionTree exprTree = expr.length != 0 ? 
PartFilterExprUtil.makeExpressionTree(

Review comment:
   Can you explain why this change was needed?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735179)
Time Spent: 20m  (was: 10m)

> Table migration to Iceberg doesn't remove HMS partitions
> 
>
> Key: HIVE-25894
> URL: https://issues.apache.org/jira/browse/HIVE-25894
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Repro:
> {code:java}
> create table ice_part_migrate (i int) partitioned by (p int) stored as 
> parquet;
> insert into ice_part_migrate partition(p=1) values (1), (11), (111);
> insert into ice_part_migrate partition(p=2) values (2), (22), (222);
> ALTER TABLE ice_part_migrate  SET TBLPROPERTIES 
> ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');
> {code}
> Then looking at the HMS database:
> {code:java}
> => select "PART_NAME" from "PARTITIONS" p, "TBLS" t where 
> t."TBL_ID"=p."TBL_ID" and t."TBL_NAME"='ice_part_migrate';
>  PART_NAME
> ---
>  p=1
>  p=2
> {code}
> This is weird because Iceberg tables are supposed to be unpartitioned. It 
> also breaks some precondition checks in Impala. Is there a particular reason 
> to keep the partitions in HMS?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25998) Build iceberg modules without a flag

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-25998:
-


> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25997:
--
Summary: Build from source distribution archive fails  (was: Fix release 
source packaging)

> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Build from source distribution archive fails

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25997:
--
Description: 
The generated source package is not compiling with:
{code:java}
mvn clean install -DskipTests -Piceberg{code}
We should fix that for the release

  was:
The generated source package is not compiling with:
{code:java}
mvn clean install -DskipTests {code}
We should fix that for the release


> Build from source distribution archive fails
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests -Piceberg{code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25665) Checkstyle LGPL files must not be in the release sources/binaries

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25665:
--
Target Version/s: 4.0.0-alpha-1  (was: 4.0.0)

> Checkstyle LGPL files must not be in the release sources/binaries
> -
>
> Key: HIVE-25665
> URL: https://issues.apache.org/jira/browse/HIVE-25665
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Affects Versions: 0.6.0
>Reporter: Stamatis Zampetakis
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As discussed in the [dev 
> list|https://lists.apache.org/thread/r13e3236aa72a070b3267ed95f7cb3b45d3c4783fd4ca35f5376b1a35@%3cdev.hive.apache.org%3e]
>  LGPL files must not be present in the Apache released sources/binaries.
> The following files must not be present in the release:
> https://github.com/apache/hive/blob/6e152aa28bc5116bf9210f9deb0f95d2d73183f7/checkstyle/checkstyle-noframes-sorted.xsl
> https://github.com/apache/hive/blob/6e152aa28bc5116bf9210f9deb0f95d2d73183f7/storage-api/checkstyle/checkstyle-noframes-sorted.xsl
> https://github.com/apache/hive/blob/6e152aa28bc5116bf9210f9deb0f95d2d73183f7/standalone-metastore/checkstyle/checkstyle-noframes-sorted.xsl
> There may be other checkstyle LGPL files in the repo. All these should either 
> be removed entirely from the repository or selectively excluded from the 
> release.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Fix release source packaging

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25997:
--
Component/s: Build Infrastructure

> Fix release source packaging
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25995) Build from source distribution archive fails

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-25995.
---
Fix Version/s: 4.0.0-alpha-1
   Resolution: Fixed

> Build from source distribution archive fails
> 
>
> Key: HIVE-25995
> URL: https://issues.apache.org/jira/browse/HIVE-25995
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Stamatis Zampetakis
>Priority: Blocker
> Fix For: 4.0.0-alpha-1
>
>
> The source distribution archive, apache-hive-4.0.0-SNAPSHOT-src.tar.gz, can 
> be produced by running:
> {code:bash}
> mvn clean package -DskipTests -Pdist
> {code}
> The file is generated under:
> {noformat}
> packaging/target/apache-hive-4.0.0-SNAPSHOT-src.tar.gz
> {noformat}
> The source distribution archive/package 
> [should|https://www.apache.org/legal/release-policy.html#source-packages] 
> allow anyone who downloads it to build and test Hive.
> At the moment, on commit 
> [b63dab11d229abac59a4ef5e141d8d9b28037c8b|https://github.com/apache/hive/commit/b63dab11d229abac59a4ef5e141d8d9b28037c8b],
>  if someone produces the source package and extracts the contents of the 
> archive, it is not possible to build Hive.
> Both {{mvn install}} and {{mvn package}} commands fail when they are executed 
> inside the directory extracted from the archive.
> {noformat}
> mvn clean install -DskipTests
> mvn clean package -DskipTests
> {noformat}
> The error is shown below:
> {noformat}
> [INFO] Scanning for projects...
> [ERROR] [ERROR] Some problems were encountered while processing the POMs:
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/parser of 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not 
> exist @ 
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/udf of 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not 
> exist @ 
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/standalone-metastore/pom.xml
>  of /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not 
> exist @ 
>  @ 
> [ERROR] The build could not read 1 project -> [Help 1]
> [ERROR]   
> [ERROR]   The project org.apache.hive:hive:4.0.0-SNAPSHOT 
> (/home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml) has 3 errors
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/parser of 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not exist
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/udf of 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not exist
> [ERROR] Child module 
> /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/standalone-metastore/pom.xml
>  of /home/stamatis/Downloads/apache-hive-4.0.0-SNAPSHOT-src/pom.xml does not 
> exist
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Fix release source packaging

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25997:
--
Issue Type: Bug  (was: Improvement)

> Fix release source packaging
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Fix release source packaging

2022-03-02 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25997:
--
Priority: Blocker  (was: Major)

> Fix release source packaging
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25997) Fix release source packaging

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?focusedWorklogId=735173=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735173
 ]

ASF GitHub Bot logged work on HIVE-25997:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 08:22
Start Date: 02/Mar/22 08:22
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3067:
URL: https://github.com/apache/hive/pull/3067


   ### What changes were proposed in this pull request?
   Changes in the packaging files for the src tar.gz
   
   ### Why are the changes needed?
   We need the src package to be buildable
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually:
   ```
   tar -xvzf apache-hive-4.0.0-SNAPSHOT-src.tar.gz
   cd apache-hive-4.0.0-SNAPSHOT-src
   mvn clean install -DskipTests -Piceberg
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735173)
Remaining Estimate: 0h
Time Spent: 10m

> Fix release source packaging
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25997) Fix release source packaging

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25997:
--
Labels: pull-request-available  (was: )

> Fix release source packaging
> 
>
> Key: HIVE-25997
> URL: https://issues.apache.org/jira/browse/HIVE-25997
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0-alpha-1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The generated source package is not compiling with:
> {code:java}
> mvn clean install -DskipTests {code}
> We should fix that for the release



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25750) Beeline: Creating a standalone tarball by isolating dependencies

2022-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25750?focusedWorklogId=735163=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-735163
 ]

ASF GitHub Bot logged work on HIVE-25750:
-

Author: ASF GitHub Bot
Created on: 02/Mar/22 07:56
Start Date: 02/Mar/22 07:56
Worklog Time Spent: 10m 
  Work Description: achennagiri commented on pull request #3043:
URL: https://github.com/apache/hive/pull/3043#issuecomment-1056501302


   The fix has been committed upstream.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 735163)
Time Spent: 3h 50m  (was: 3h 40m)

> Beeline: Creating a standalone tarball by isolating dependencies
> 
>
> Key: HIVE-25750
> URL: https://issues.apache.org/jira/browse/HIVE-25750
> Project: Hive
>  Issue Type: Bug
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The code to create a standalone beeline tarball was created as part of this 
> ticket https://issues.apache.org/jira/browse/HIVE-24348. However, a bug was 
> reported for the case when beeline is installed without Hadoop being 
> installed: the beeline script complains of missing dependencies when it is run.
> The ask as part of this ticket is to fix that bug.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)