[jira] [Updated] (HIVE-27996) Revert HIVE-27406 & HIVE-27481

2024-01-22 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-27996:
--
Priority: Minor  (was: Blocker)

> Revert HIVE-27406 & HIVE-27481
> --
>
> Key: HIVE-27996
> URL: https://issues.apache.org/jira/browse/HIVE-27996
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0-beta-1
>Reporter: László Végh
>Assignee: Laszlo Vegh
>Priority: Minor
>  Labels: pull-request-available
>
> Revert HIVE-27406 & HIVE-27481
>  
> The introduced changes were causing DB incompatibility issues.
> {code}
> create table if not exists tab_acid (a int) partitioned by (p string) stored 
> as orc TBLPROPERTIES ('transactional'='true');
> insert into tab_acid values(1,'foo'),(3,'bar');
> Caused by: MetaException(message:The update count was rejected in at least 
> one of the result array. Rolling back.)
>   at 
> org.apache.hadoop.hive.metastore.txn.jdbc.MultiDataSourceJdbcResource.execute(MultiDataSourceJdbcResource.java:217)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.addDynamicPartitions(TxnHandler.java:876)
> {code}
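The quoted MetaException is raised when a JDBC batch returns an update count that the caller refuses to accept. A minimal, self-contained sketch of that validation pattern (the helper name and the simulated counts are illustrative, not Hive's actual code):

```java
import java.sql.Statement;

public class BatchCountCheck {
    // Mirrors the pattern behind the quoted error: after executeBatch(),
    // every element of the returned array must be an acceptable row count;
    // Statement.EXECUTE_FAILED (or any negative count) triggers a rollback.
    static boolean allCountsAccepted(int[] updateCounts) {
        for (int c : updateCounts) {
            if (c == Statement.EXECUTE_FAILED || c < 0) {
                return false; // caller rolls back, as in the quoted trace
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allCountsAccepted(new int[] {1, 1}));
        System.out.println(allCountsAccepted(
                new int[] {1, Statement.EXECUTE_FAILED}));
    }
}
```

A schema change that makes one statement in the batch silently match zero or unexpected rows is enough to trip such a check, which is consistent with the reported DB incompatibility.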



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-28010) Using apache fury instead of kyro/protubuf

2024-01-22 Thread yongzhi.shao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809768#comment-17809768
 ] 

yongzhi.shao edited comment on HIVE-28010 at 1/23/24 7:18 AM:
--

[~zhangbutao] :
 # For serializing/deserializing Hive records, Hive has its own custom SerDe; 
maybe we could use Fury there.
 # Tez shuffle uses protobuf; we could use Fury there too.
 # Kryo is used for serializing/deserializing Hive operators; this could be 
replaced by Fury.

Since Fury's performance is currently much higher than protobuf/Kryo, we can 
reasonably expect a performance boost from Fury.

 

For Example:

[[Flink] Optimize CDC sink serde with Fury by xuchen-plus · Pull Request #307 · 
lakesoul-io/LakeSoul|https://github.com/lakesoul-io/LakeSoul/pull/307]
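For context, the baseline that libraries like Kryo and Fury replace is the JDK's built-in object serialization; they expose the same serialize/deserialize round-trip shape, just with faster and more compact encodings. A minimal sketch of that round trip using only the JDK (illustrative only, not Hive's actual serde code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerRoundTrip {
    // Round-trips a value through Java's built-in serialization - the slow
    // baseline that Kryo (and, per this proposal, Fury) would replace.
    static Object roundTrip(Serializable value) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(value); // encode to bytes
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return in.readObject(); // decode back to an object
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello")); // prints: hello
    }
}
```

Any benchmark of a Fury integration would compare exactly this round trip (encode + decode of operator trees or shuffle records) against the replacement encoder.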


was (Author: lisoda):
[~zhangbutao] :
 # For serializing/deserializing Hive records, Hive has its own custom SerDe; 
maybe we could use Fury there.
 # Tez shuffle uses protobuf; we could use Fury there too.
 # Kryo is used for serializing/deserializing Hive operators; this could be 
replaced by Fury.

Since Fury's performance is currently much higher than protobuf/Kryo, we can 
reasonably expect a performance boost from Fury.

> Using apache fury instead of kyro/protubuf
> --
>
> Key: HIVE-28010
> URL: https://issues.apache.org/jira/browse/HIVE-28010
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: yongzhi.shao
>Priority: Minor
>
> APACHE FURY is a new serialisation framework that can significantly improve 
> serialisation/deserialisation performance compared to Kyro and Protobuf. Do 
> we need Fury in HIVE?





[jira] [Comment Edited] (HIVE-28010) Using apache fury instead of kyro/protubuf

2024-01-22 Thread yongzhi.shao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809768#comment-17809768
 ] 

yongzhi.shao edited comment on HIVE-28010 at 1/23/24 7:15 AM:
--

[~zhangbutao] :
 # For serializing/deserializing Hive records, Hive has its own custom SerDe; 
maybe we could use Fury there.
 # Tez shuffle uses protobuf; we could use Fury there too.
 # Kryo is used for serializing/deserializing Hive operators; this could be 
replaced by Fury.

Since Fury's performance is currently much higher than protobuf/Kryo, we can 
reasonably expect a performance boost from Fury.


was (Author: lisoda):
[~zhangbutao] :
 # For serializing/deserializing Hive records, Hive has its own custom SerDe; 
maybe we could use Fury there.
 # Tez shuffle uses protobuf; we could use Fury there too.
 # Kryo is used for serializing/deserializing Hive operators; this could be 
replaced by Fury.

> Using apache fury instead of kyro/protubuf
> --
>
> Key: HIVE-28010
> URL: https://issues.apache.org/jira/browse/HIVE-28010
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: yongzhi.shao
>Priority: Minor
>
> APACHE FURY is a new serialisation framework that can significantly improve 
> serialisation/deserialisation performance compared to Kyro and Protobuf. Do 
> we need Fury in HIVE?





[jira] [Commented] (HIVE-28010) Using apache fury instead of kyro/protubuf

2024-01-22 Thread yongzhi.shao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809768#comment-17809768
 ] 

yongzhi.shao commented on HIVE-28010:
-

[~zhangbutao] :
 # For serializing/deserializing Hive records, Hive has its own custom SerDe; 
maybe we could use Fury there.
 # Tez shuffle uses protobuf; we could use Fury there too.
 # Kryo is used for serializing/deserializing Hive operators; this could be 
replaced by Fury.

> Using apache fury instead of kyro/protubuf
> --
>
> Key: HIVE-28010
> URL: https://issues.apache.org/jira/browse/HIVE-28010
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: yongzhi.shao
>Priority: Minor
>
> APACHE FURY is a new serialisation framework that can significantly improve 
> serialisation/deserialisation performance compared to Kyro and Protobuf. Do 
> we need Fury in HIVE?





[jira] [Commented] (HIVE-28010) Using apache fury instead of kyro/protubuf

2024-01-22 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809766#comment-17809766
 ] 

Butao Zhang commented on HIVE-28010:


I am not familiar with Apache Fury; it looks interesting!

But I have never seen serialization/deserialization performance (Kryo and 
Protobuf) be a bottleneck for Hive, so I am not sure it is worth trying to 
introduce Apache Fury into Hive.

Anyway, if someone does some benchmarking and then integrates Apache Fury, I 
think that will attract more Hive dev attention.

> Using apache fury instead of kyro/protubuf
> --
>
> Key: HIVE-28010
> URL: https://issues.apache.org/jira/browse/HIVE-28010
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: yongzhi.shao
>Priority: Minor
>
> APACHE FURY is a new serialisation framework that can significantly improve 
> serialisation/deserialisation performance compared to Kyro and Protobuf. Do 
> we need Fury in HIVE?





[jira] [Commented] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809764#comment-17809764
 ] 

Ramesh Kumar Thangarajan commented on HIVE-27751:
-

Hi [~zabetak] 

Thank you very much for reviewing this. I have updated the description with the 
sample output. 

Usually the debug logs are spread across multiple places, and we do not have 
an easy way to get the details from users when they run into performance 
issues. The main idea of this PR is to output the information in the 
command-line output too, and only when the config is turned on. That is what I 
meant by accumulated: we get all the details related to query compilation in 
one single place, visible to the user as part of the query output.

Also, I have addressed your comments; can you let me know what you think about 
the latest patch?

> Log Query Compilation summary in an accumulated way
> ---
>
> Key: HIVE-27751
> URL: https://issues.apache.org/jira/browse/HIVE-27751
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Query Compilation summary is very useful for reading and collecting all the 
> measures of compile time in a single place. It is also useful for debugging a 
> performance issue in the query compilation phase, and for reporting and 
> comparing across runs.
> To test this, set the config hive.compile.print.summary to true in any q file 
> and run the test to see the Query Compilation Summary in the logs. One 
> example of the output is below; the order of operations is maintained when 
> printing the summary:
> {code:java}
> Query Compilation Summary
> --
> waitCompile                                               0 ms
> parse                                                     4 ms
> getTableConstraints - HS2-cache                          69 ms
> optimizer - Calcite: Plan generation                    257 ms
> optimizer - Calcite: Prejoin ordering transformation     20 ms
> optimizer - Calcite: Postjoin ordering transformation    24 ms
> optimizer                                               705 ms
> optimizer - HiveOpConverterPostProc                       0 ms
> optimizer - Generator                                    24 ms
> optimizer - PartitionColumnsSeparator                     1 ms
> optimizer - SyntheticJoinPredicate                        2 ms
> optimizer - SimplePredicatePushDown                       8 ms
> optimizer - RedundantDynamicPruningConditionsRemoval      0 ms
> optimizer - SortedDynPartitionTimeGranularityOptimizer    2 ms
> optimizer - PartitionPruner                               3 ms
> optimizer - PartitionConditionRemover                     2 ms
> optimizer - GroupByOptimizer                              2 ms
> optimizer - ColumnPruner                                 10 ms
> optimizer - CountDistinctRewriteProc                      1 ms
> optimizer - SamplePruner                                  1 ms
> optimizer - MapJoinProcessor                              2 ms
> optimizer - BucketingSortingReduceSinkOptimizer           2 ms
> optimizer - UnionProcessor                                2 ms
> optimizer - JoinReorder                                   0 ms
> optimizer - FixedBucketPruningOptimizer                   2 ms
> optimizer - 

[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27751:

Description: 
Query Compilation summary is very useful for reading and collecting all the 
measures of compile time in a single place. It is also useful for debugging a 
performance issue in the query compilation phase, and for reporting and 
comparing across runs.

To test this, set the config hive.compile.print.summary to true in any q file 
and run the test to see the Query Compilation Summary in the logs. One example 
of the output is below; the order of operations is maintained when printing 
the summary:
{code:java}
Query Compilation Summary
--
waitCompile                                               0 ms
parse                                                     4 ms
getTableConstraints - HS2-cache                          69 ms
optimizer - Calcite: Plan generation                    257 ms
optimizer - Calcite: Prejoin ordering transformation     20 ms
optimizer - Calcite: Postjoin ordering transformation    24 ms
optimizer                                               705 ms
optimizer - HiveOpConverterPostProc                       0 ms
optimizer - Generator                                    24 ms
optimizer - PartitionColumnsSeparator                     1 ms
optimizer - SyntheticJoinPredicate                        2 ms
optimizer - SimplePredicatePushDown                       8 ms
optimizer - RedundantDynamicPruningConditionsRemoval      0 ms
optimizer - SortedDynPartitionTimeGranularityOptimizer    2 ms
optimizer - PartitionPruner                               3 ms
optimizer - PartitionConditionRemover                     2 ms
optimizer - GroupByOptimizer                              2 ms
optimizer - ColumnPruner                                 10 ms
optimizer - CountDistinctRewriteProc                      1 ms
optimizer - SamplePruner                                  1 ms
optimizer - MapJoinProcessor                              2 ms
optimizer - BucketingSortingReduceSinkOptimizer           2 ms
optimizer - UnionProcessor                                2 ms
optimizer - JoinReorder                                   0 ms
optimizer - FixedBucketPruningOptimizer                   2 ms
optimizer - BucketVersionPopulator                        2 ms
optimizer - NonBlockingOpDeDupProc                        1 ms
optimizer - IdentityProjectRemover                        0 ms
optimizer - LimitPushdownOptimizer                        2 ms
optimizer - OrderlessLimitPushDownOptimizer               1 ms
optimizer - StatsOptimizer                                0 ms
optimizer - SimpleFetchOptimizer                          0 ms
TezCompiler - Run top n key optimization                  2 ms
TezCompiler - Setup dynamic partition pruning             3 ms
optimizer - Merge single column semi-join reducers to composite    0 ms
partition-retrieving                                      1 ms
TezCompiler - Setup stats in the operator plan

[jira] [Commented] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-22 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809762#comment-17809762
 ] 

Butao Zhang commented on HIVE-28015:


Spark-Iceberg uses the *alter set* syntax to add identifier-field-ids; should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id*}} -- single column

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data*}} -- multiple 
columns

Or should we use the *primary key(i)* syntax as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Comment Edited] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-22 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809762#comment-17809762
 ] 

Butao Zhang edited comment on HIVE-28015 at 1/23/24 6:55 AM:
-

Spark-Iceberg uses the *alter set* syntax to add identifier-field-ids; should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id*}} -- single column

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data*}} -- multiple 
columns

Or should we use the *primary key(i)* syntax as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*


was (Author: zhangbutao):
Spark-Iceberg uses the *alter set* syntax to add identifier-field-ids; should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id*}} -- single column

{{*ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data*}} -- multiple 
columns

Or should we use the *primary key(i)* syntax as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27751:

Description: 
Query Compilation summary is very useful for reading and collecting all the 
measures of compile time in a single place. It is also useful for debugging a 
performance issue in the query compilation phase, and for reporting and 
comparing across runs.

 

After the 

  was:Query Compilation summary is very useful for reading and collecting all 
the measures of compile time in a single place. It is also useful in debugging 
a performance issue in the query compilation phase and also to report and 
compare with various runs


> Log Query Compilation summary in an accumulated way
> ---
>
> Key: HIVE-27751
> URL: https://issues.apache.org/jira/browse/HIVE-27751
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Query Compilation summary is very useful for reading and collecting all the 
> measures of compile time in a single place. It is also useful for debugging a 
> performance issue in the query compilation phase, and for reporting and 
> comparing across runs.
>  
> After the 





[jira] [Commented] (HIVE-26713) StringExpr ArrayIndexOutOfBoundsException with LIKE '%xxx%'

2024-01-22 Thread Ryu Kobayashi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809758#comment-17809758
 ] 

Ryu Kobayashi commented on HIVE-26713:
--

Thanks [~zhangbutao] and [~aturoczy] .
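For reference, the overflow described in the quoted issue below stems from Java's signed bytes: a control or extended character reads as a negative `byte`, and masking it with `MAX_BYTE` (0xFF) maps it to an index up to 255. A minimal demonstration of the masking behavior (standalone, not Hive code):

```java
public class SignedByteMask {
    public static void main(String[] args) {
        final int MAX_BYTE = 0xFF;
        byte b = (byte) 0xFF; // e.g. an extended/control byte: reads as -1
        // Unmasked, the byte sign-extends to a negative int; masked, it
        // becomes 255, which overruns a shift table sized for fewer entries.
        System.out.println((int) b);      // prints: -1
        System.out.println(b & MAX_BYTE); // prints: 255
    }
}
```

This is why the stack trace ends in `ArrayIndexOutOfBoundsException: 255` inside `StringExpr$BoyerMooreHorspool.find`.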

> StringExpr ArrayIndexOutOfBoundsException with LIKE '%xxx%'
> ---
>
> Key: HIVE-26713
> URL: https://issues.apache.org/jira/browse/HIVE-26713
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Affects Versions: All Versions
>Reporter: Ryu Kobayashi
>Assignee: Ryu Kobayashi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a LIKE('%xxx%') search is performed and the string contains 
> control characters, an overflow occurs as follows.
> https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringExpr.java#L345
> {code:java}
> // input[next] == -1
> // shift[input[next] & MAX_BYTE] == 255
> next += shift[input[next] & MAX_BYTE]; {code}
>  
> Stack trace:
> {code:java}
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1665986828766_64791_1_00_00_3:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
> 2 at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:220)
> 3 at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:177)
> 4 at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:479)
> 5 at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 6 at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> 7 at java.security.AccessController.doPrivileged(Native Method)
> 8 at javax.security.auth.Subject.doAs(Subject.java:422)
> 9 at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
> 10at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> 11at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> 12at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 13at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> 14at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> 15at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> 16at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 17at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 18at java.lang.Thread.run(Thread.java:750)
> 19Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> 20at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
> 21at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> 22at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> 23at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:194)
> 24... 16 more
> 25Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
> 26at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:883)
> 27at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> 28... 19 more
> 29Caused by: java.lang.ArrayIndexOutOfBoundsException: 255
> 30at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.StringExpr$BoyerMooreHorspool.find(StringExpr.java:409)
> 31at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar$MiddleChecker.index(AbstractFilterStringColLikeStringScalar.java:314)
> 32at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar$MiddleChecker.check(AbstractFilterStringColLikeStringScalar.java:307)
> 33at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar.evaluate(AbstractFilterStringColLikeStringScalar.java:115)
> 34at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprOrExpr.evaluate(FilterExprOrExpr.java:183)
> 35at 
> 

[jira] [Resolved] (HIVE-26713) StringExpr ArrayIndexOutOfBoundsException with LIKE '%xxx%'

2024-01-22 Thread Butao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Butao Zhang resolved HIVE-26713.

Fix Version/s: 4.1.0
   Resolution: Fixed

The fix has been merged into the master branch.

Thanks for your contribution, [~ryu_kobayashi], and thanks for your review, 
[~aturoczy].

> StringExpr ArrayIndexOutOfBoundsException with LIKE '%xxx%'
> ---
>
> Key: HIVE-26713
> URL: https://issues.apache.org/jira/browse/HIVE-26713
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Affects Versions: All Versions
>Reporter: Ryu Kobayashi
>Assignee: Ryu Kobayashi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a LIKE('%xxx%') search is performed and the string contains 
> control characters, an overflow occurs as follows.
> https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringExpr.java#L345
> {code:java}
> // input[next] == -1
> // shift[input[next] & MAX_BYTE] == 255
> next += shift[input[next] & MAX_BYTE]; {code}
>  
> Stack trace:
> {code:java}
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1665986828766_64791_1_00_00_3:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
> 2 at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:220)
> 3 at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:177)
> 4 at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:479)
> 5 at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 6 at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> 7 at java.security.AccessController.doPrivileged(Native Method)
> 8 at javax.security.auth.Subject.doAs(Subject.java:422)
> 9 at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
> 10at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> 11at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> 12at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 13at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> 14at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> 15at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> 16at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 17at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 18at java.lang.Thread.run(Thread.java:750)
> 19Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> 20at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
> 21at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> 22at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> 23at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:194)
> 24... 16 more
> 25Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
> 26at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:883)
> 27at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> 28... 19 more
> 29Caused by: java.lang.ArrayIndexOutOfBoundsException: 255
> 30at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.StringExpr$BoyerMooreHorspool.find(StringExpr.java:409)
> 31at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar$MiddleChecker.index(AbstractFilterStringColLikeStringScalar.java:314)
> 32at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar$MiddleChecker.check(AbstractFilterStringColLikeStringScalar.java:307)
> 33at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.AbstractFilterStringColLikeStringScalar.evaluate(AbstractFilterStringColLikeStringScalar.java:115)
> 34at 
> 

[jira] [Commented] (HIVE-28018) Don't require HiveConf for JDBC Metadata calls

2024-01-22 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809621#comment-17809621
 ] 

Ayush Saxena commented on HIVE-28018:
-

Some discussion around the same code:

[https://github.com/apache/hive/pull/1443#discussion_r479722459]

The value should be present: when opening a session with HS2, the config value 
should have been sent back (it was added in ThriftCLIService, OpenService). I 
wonder whether the hiveconf: prefix is creating an issue...
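One defensive option the ticket points toward is falling back to a documented default when the key is absent from the session config, instead of throwing. A hypothetical sketch (the plain map stands in for the session config, and the nulls-last default of "true" is an assumption, not Hive's confirmed behavior):

```java
import java.util.Map;

public class NullsOrder {
    // Hypothetical helper: resolve the metadata answer from the session
    // config, falling back to a default instead of throwing when
    // hive.default.nulls.last was never propagated to the client.
    static boolean nullsAreSortedLow(Map<String, String> sessionConf) {
        String v = sessionConf.getOrDefault("hive.default.nulls.last", "true");
        // NULLs sorted last means they are not sorted "low".
        return !Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        // Key absent: the default applies and no exception is thrown.
        System.out.println(nullsAreSortedLow(Map.of()));
    }
}
```

The same lookup-with-default shape would let `conn.getMetaData().nullsAreSortedLow()` answer without requiring the key in HiveConf.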

> Don't require HiveConf for JDBC Metadata calls
> --
>
> Key: HIVE-28018
> URL: https://issues.apache.org/jira/browse/HIVE-28018
> Project: Hive
>  Issue Type: Improvement
>Reporter: Kurt Deschler
>Assignee: Kurt Deschler
>Priority: Major
>
> The following JDBC call will throw an exception if hive.default.nulls.last 
> is not set in HiveConf:
> conn.getMetaData().nullsAreSortedLow();





[jira] [Created] (HIVE-28018) Don't require HiveConf for JDBC Metadata calls

2024-01-22 Thread Kurt Deschler (Jira)
Kurt Deschler created HIVE-28018:


 Summary: Don't require HiveConf for JDBC Metadata calls
 Key: HIVE-28018
 URL: https://issues.apache.org/jira/browse/HIVE-28018
 Project: Hive
  Issue Type: Improvement
Reporter: Kurt Deschler
Assignee: Kurt Deschler


The following JDBC call will throw an exception if hive.default.nulls.last is 
not set in HiveConf:

conn.getMetaData().nullsAreSortedLow();





[jira] [Updated] (HIVE-28016) Iceberg: NULL column values handling in COW mode

2024-01-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28016:
--
Labels: pull-request-available  (was: )

> Iceberg: NULL column values handling in COW mode
> 
>
> Key: HIVE-28016
> URL: https://issues.apache.org/jira/browse/HIVE-28016
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Assigned] (HIVE-28017) Add generated protobuf code

2024-01-22 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HIVE-28017:
---

Assignee: Ayush Saxena

> Add generated protobuf code
> ---
>
> Key: HIVE-28017
> URL: https://issues.apache.org/jira/browse/HIVE-28017
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> HIVE-26790 upgraded protobuf, but didn't generate the code wrt the newer 
> version





[jira] [Commented] (HIVE-26790) Upgrade to protobuf 3.21.7 / grpc 1.51.0; build on Apple M1

2024-01-22 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809498#comment-17809498
 ] 

Ayush Saxena commented on HIVE-26790:
-

This didn't regenerate the code. [~henrib] / [~dengzh], I have created a 
ticket for that...

I hope we don't see any failures there.

> Upgrade to protobuf 3.21.7 / grpc 1.51.0; build on Apple M1
> ---
>
> Key: HIVE-26790
> URL: https://issues.apache.org/jira/browse/HIVE-26790
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 4.0.0-alpha-2
>Reporter: Henri Biestro
>Assignee: Henri Biestro
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> While trying to build for Apple M1, we bumped into grpc (1.24.0) missing 
> binaries for Apple M1.
> Updating to grpc 1.51.0 and protobuf 3.21.7 solves the issue.





[jira] [Assigned] (HIVE-28016) Iceberg: NULL column values handling in COW mode

2024-01-22 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-28016:
-

Assignee: Denys Kuzmenko

> Iceberg: NULL column values handling in COW mode
> 
>
> Key: HIVE-28016
> URL: https://issues.apache.org/jira/browse/HIVE-28016
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>






[jira] [Updated] (HIVE-28017) Add generated protobuf code

2024-01-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28017:
--
Labels: pull-request-available  (was: )

> Add generated protobuf code
> ---
>
> Key: HIVE-28017
> URL: https://issues.apache.org/jira/browse/HIVE-28017
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> HIVE-26790 upgraded protobuf, but didn't regenerate the code for the newer 
> version





[jira] [Created] (HIVE-28017) Add generated protobuf code

2024-01-22 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-28017:
---

 Summary: Add generated protobuf code
 Key: HIVE-28017
 URL: https://issues.apache.org/jira/browse/HIVE-28017
 Project: Hive
  Issue Type: Bug
Reporter: Ayush Saxena


HIVE-26790 upgraded protobuf, but didn't regenerate the code for the newer version





[jira] [Updated] (HIVE-28016) Iceberg: NULL column values handling in COW mode

2024-01-22 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-28016:
--
Summary: Iceberg: NULL column values handling in COW mode  (was: Iceberg: 
NULL column values handling)

> Iceberg: NULL column values handling in COW mode
> 
>
> Key: HIVE-28016
> URL: https://issues.apache.org/jira/browse/HIVE-28016
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Priority: Major
>






[jira] [Created] (HIVE-28016) Iceberg: NULL column values handling

2024-01-22 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-28016:
-

 Summary: Iceberg: NULL column values handling
 Key: HIVE-28016
 URL: https://issues.apache.org/jira/browse/HIVE-28016
 Project: Hive
  Issue Type: Sub-task
Reporter: Denys Kuzmenko








[jira] [Commented] (HIVE-27991) Utilise FanoutWriters when inserting records in an Iceberg table when the records are unsorted

2024-01-22 Thread Sourabh Badhya (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809460#comment-17809460
 ] 

Sourabh Badhya commented on HIVE-27991:
---

Merged to master.
Thanks [~zhangbutao] for the review.

> Utilise FanoutWriters when inserting records in an Iceberg table when the 
> records are unsorted
> --
>
> Key: HIVE-27991
> URL: https://issues.apache.org/jira/browse/HIVE-27991
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>
> FanoutWriter is an Iceberg writer that can be used to write records to a 
> table. It keeps all file handles open until the write is finished, so it can 
> accept records for any partition in any order. FanoutWriter is therefore the 
> right choice when the incoming records are unsorted. By default, we can 
> switch from ClusteredWriters to FanoutWriters when no custom sort 
> expressions are present for the given table/query.
> A similar mechanism is already implemented in Spark - 
> [https://github.com/apache/iceberg/pull/8621]
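The tradeoff can be sketched with a toy model (hypothetical class names and heavily simplified behavior; the real writers live in Iceberg's `org.apache.iceberg.io` package and manage actual data files, roll-over, and partition specs): a clustered writer holds one open file at a time and must reject input that revisits an already-closed partition, while a fanout writer keeps a handle open per partition until the write finishes.

```python
class ClusteredWriter:
    """Keeps only the current partition's handle open; input must arrive
    grouped (sorted/clustered) by partition key."""
    def __init__(self):
        self.current = None
        self.seen = set()
        self.files = {}

    def write(self, partition, record):
        if partition != self.current:
            if partition in self.seen:
                # Revisiting a closed partition means the input was unsorted.
                raise ValueError(f"records for {partition!r} are not clustered")
            self.seen.add(partition)
            self.current = partition
        self.files.setdefault(partition, []).append(record)


class FanoutWriter:
    """Keeps one open handle per partition, so records may arrive in any order."""
    def __init__(self):
        self.files = {}

    def write(self, partition, record):
        self.files.setdefault(partition, []).append(record)


unsorted = [("p=1", "a"), ("p=2", "b"), ("p=1", "c")]

fanout = FanoutWriter()
for part, rec in unsorted:
    fanout.write(part, rec)          # succeeds: all handles stay open

clustered = ClusteredWriter()
try:
    for part, rec in unsorted:
        clustered.write(part, rec)   # fails on the second "p=1" record
except ValueError as e:
    print(e)
```

The cost of the fanout strategy is memory: every partition touched by the write keeps an open file handle until commit, which is why it is only preferable when no sort order is available.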





[jira] [Resolved] (HIVE-27991) Utilise FanoutWriters when inserting records in an Iceberg table when the records are unsorted

2024-01-22 Thread Sourabh Badhya (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Badhya resolved HIVE-27991.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Utilise FanoutWriters when inserting records in an Iceberg table when the 
> records are unsorted
> --
>
> Key: HIVE-27991
> URL: https://issues.apache.org/jira/browse/HIVE-27991
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> FanoutWriter is an Iceberg writer that can be used to write records to a 
> table. It keeps all file handles open until the write is finished, so it can 
> accept records for any partition in any order. FanoutWriter is therefore the 
> right choice when the incoming records are unsorted. By default, we can 
> switch from ClusteredWriters to FanoutWriters when no custom sort 
> expressions are present for the given table/query.
> A similar mechanism is already implemented in Spark - 
> [https://github.com/apache/iceberg/pull/8621]





[jira] [Resolved] (HIVE-28009) Shared work optimizer ignores schema merge setting in case of virtual column difference

2024-01-22 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-28009.
---
Fix Version/s: 4.1.0
   Resolution: Fixed

Merged to master. Thanks [~dkuzmenko] for review!

> Shared work optimizer ignores schema merge setting in case of virtual column 
> difference
> ---
>
> Key: HIVE-28009
> URL: https://issues.apache.org/jira/browse/HIVE-28009
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0, 4.0.0-beta-1
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> {code:java}
> set hive.optimize.shared.work.merge.ts.schema=false;
> create table t1(a int);
> explain
> WITH t AS (
>   select BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, a from (
> select BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, a, row_number() 
> OVER (partition by INPUT__FILE__NAME) rn from t1
> where a = 1
>   ) q
>   where rn=1
> )
> select BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, a from t1 where NOT (a 
> = 1) AND INPUT__FILE__NAME IN (select INPUT__FILE__NAME from t)
> union all
> select * from t
> {code}
> Before SharedWorkOptimizer:
> {code:java}
> TS[0]-FIL[32]-SEL[2]-RS[14]-MERGEJOIN[42]-SEL[17]-UNION[27]-FS[29]
> TS[3]-FIL[34]-RS[5]-SEL[6]-PTF[7]-FIL[33]-SEL[8]-GBY[13]-RS[15]-MERGEJOIN[42]
> TS[18]-FIL[36]-RS[20]-SEL[21]-PTF[22]-FIL[35]-SEL[23]-UNION[27]
> {code}
> After SharedWorkOptimizer:
> {code:java}
> TS[0]-FIL[32]-SEL[2]-RS[14]-MERGEJOIN[42]-SEL[17]-UNION[27]-FS[29]
>  -FIL[34]-RS[5]-SEL[6]-PTF[7]-FIL[33]-SEL[8]-GBY[13]-RS[15]-MERGEJOIN[42]
> TS[18]-FIL[36]-RS[20]-SEL[21]-PTF[22]-FIL[35]-SEL[23]-UNION[27]
> {code}
> TS[3] and TS[18] are merged but their schema doesn't match and 
> {{hive.optimize.shared.work.merge.ts.schema}} was turned off in the test
> {code:java}
> TS[3]: 0 = FILENAME
> TS[18]: 0 = BLOCKOFFSET,  FILENAME
> {code}
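A stripped-down model of the guard this fix restores (a hypothetical sketch; the real logic lives in Hive's SharedWorkOptimizer and compares full TableScan schemas, including virtual columns): with schema merging disabled, two scans over the same table may only be deduplicated when their projected column sets are identical.

```python
def can_merge(cols_a, cols_b, allow_schema_merge):
    """Two TableScans can be deduplicated if their (virtual + regular) column
    sets already match, or if widening the merged scan's schema is allowed."""
    if cols_a == cols_b:
        return True
    return allow_schema_merge

ts3 = {"FILENAME"}                       # TS[3] projects only INPUT__FILE__NAME
ts18 = {"BLOCKOFFSET", "FILENAME"}       # TS[18] also needs BLOCK__OFFSET__INSIDE__FILE

print(can_merge(ts3, ts18, allow_schema_merge=False))  # must not merge
```

The bug was that virtual-column differences bypassed this check, so TS[3] and TS[18] were merged even with `hive.optimize.shared.work.merge.ts.schema=false`.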





[jira] [Updated] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-22 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-28015:
--
Description: 
Some writer engines require primary keys on a table so that they can use them 
for writing equality deletes (only the PK cols are written to the eq-delete 
files).

Hive currently doesn't reject setting PKs for Iceberg tables; however, it just 
ignores them. This succeeds:

{code}
create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
{code}
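As a rough illustration of why identifier fields matter for equality deletes (a simplified toy model, not Iceberg's actual file formats, which are Parquet/Avro delete files tracked in table metadata): an equality-delete file records only the identifier-column values of deleted rows, and readers drop any data row whose identifier columns match an entry.

```python
data = [
    {"i": 1, "j": 10},
    {"i": 2, "j": 20},
    {"i": 3, "j": 30},
]

identifier_fields = ["i"]    # what `primary key(i)` would map to
eq_deletes = [{"i": 2}]      # only identifier columns are written to the file

def apply_eq_deletes(rows, deletes, id_fields):
    # A row is deleted when its identifier-column tuple matches any delete.
    deleted = {tuple(d[f] for f in id_fields) for d in deletes}
    return [r for r in rows if tuple(r[f] for f in id_fields) not in deleted]

print(apply_eq_deletes(data, eq_deletes, identifier_fields))
# → [{'i': 1, 'j': 10}, {'i': 3, 'j': 30}]
```

Without identifier-field-ids in the table metadata, a writer engine has no declared column set to key these deletes on, which is why silently dropping the PK declaration is a problem.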

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables; however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Created] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-22 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-28015:
-

 Summary: Iceberg: Add identifier-field-ids support in Hive
 Key: HIVE-28015
 URL: https://issues.apache.org/jira/browse/HIVE-28015
 Project: Hive
  Issue Type: Improvement
Reporter: Denys Kuzmenko








[jira] [Commented] (HIVE-28013) No space left on device when running precommit tests

2024-01-22 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809360#comment-17809360
 ] 

Stamatis Zampetakis commented on HIVE-28013:


OK I will restore the days to 60 for now.

Indeed, master should not be completely broken :D You are right we should 
reconsider the impact of HIVE-27716. Should we keep discussing here or create a 
dedicated follow-up?

> No space left on device when running precommit tests
> 
>
> Key: HIVE-28013
> URL: https://issues.apache.org/jira/browse/HIVE-28013
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Blocker
> Fix For: 4.1.0
>
> Attachments: orphaned_item_strategy.png
>
>
> The Hive precommit tests fail due to lack of space. Few of the most recent 
> failures below:
> * 
> http://ci.hive.apache.org/job/hive-precommit/view/change-requests/job/PR-4744/23/console
> * 
> http://ci.hive.apache.org/job/hive-precommit/view/change-requests/job/PR-5005/10/console
> {noformat}
> java.io.IOException: No space left on device
>   at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at 
> java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
>   at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
>   at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:79)
>   at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:280)
>   at 
> org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverWriter.(RiverWriter.java:109)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:560)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:537)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgramIfPossible(CpsThreadGroup.java:520)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:444)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:97)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:315)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:279)
>   at 
> org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
>   at 
> jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
>   at 
> jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {noformat}





[jira] [Commented] (HIVE-28013) No space left on device when running precommit tests

2024-01-22 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809345#comment-17809345
 ] 

Zoltan Haindrich commented on HIVE-28013:
-

I believe the cleanup runs during the daily repo scan

> I checked the sizes of builds for master from 2021 to now and I didn't see 
> any huge spikes. It was always around 100M as I noted in a comment above.

I think those lines are about 10 *failed* tests - on master there aren't 
supposed to be any failed tests :D 

> No space left on device when running precommit tests
> 
>
> Key: HIVE-28013
> URL: https://issues.apache.org/jira/browse/HIVE-28013
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Blocker
> Fix For: 4.1.0
>
> Attachments: orphaned_item_strategy.png
>
>
> The Hive precommit tests fail due to lack of space. Few of the most recent 
> failures below:
> * 
> http://ci.hive.apache.org/job/hive-precommit/view/change-requests/job/PR-4744/23/console
> * 
> http://ci.hive.apache.org/job/hive-precommit/view/change-requests/job/PR-5005/10/console


