[ https://issues.apache.org/jira/browse/HIVE-26716?focusedWorklogId=827207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-827207 ]
ASF GitHub Bot logged work on HIVE-26716:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 18/Nov/22 15:42
Start Date: 18/Nov/22 15:42
Worklog Time Spent: 10m
Work Description: veghlaci05 commented on code in PR #3746:
URL: https://github.com/apache/hive/pull/3746#discussion_r1026589274
##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java:
##########
@@ -402,20 +413,14 @@ protected Boolean findNextCompactionAndExecute(boolean collectGenericStats, bool
todo Find a more generic approach to collecting files in the same logical bucket to compact within the same
task (currently we're using Tez split grouping).
*/
- QueryCompactor queryCompactor = QueryCompactorFactory.getQueryCompactor(t, conf, ci);
- computeStats = (queryCompactor == null && collectMrStats) || collectGenericStats;
+ Compactor compactor = compactorFactory.getQueryCompactor(msc, t, conf, ci);
+ computeStats = (compactor == null && collectMrStats) || collectGenericStats;
LOG.info("Starting " + ci.type.toString() + " compaction for " + ci.getFullPartitionName() + ", id:" +
ci.id + " in " + compactionTxn + " with compute stats set to " + computeStats);
+ LOG.info("Will compact id: " + ci.id + " with compactor class: " + compactor.getClass().getName());
- if (queryCompactor != null) {
- LOG.info("Will compact id: " + ci.id + " with query-based compactor class: "
- + queryCompactor.getClass().getName());
- queryCompactor.runCompaction(conf, t, p, sd, tblValidWriteIds, ci, dir);
- } else {
- LOG.info("Will compact id: " + ci.id + " via MR job");
- runCompactionViaMrJob(ci, t, p, sd, tblValidWriteIds, jobName, dir);
- }
+ compactor.runCompaction(conf, t, p, sd, tblValidWriteIds, ci, dir);
Review Comment:
Fixed, compactorFactory no longer returns null.
Issue Time Tracking
-------------------
Worklog Id: (was: 827207)
Time Spent: 4h 20m (was: 4h 10m)
> Query based Rebalance compaction on full acid tables
> ----------------------------------------------------
>
> Key: HIVE-26716
> URL: https://issues.apache.org/jira/browse/HIVE-26716
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: ACID, compaction, pull-request-available
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> Support rebalancing compaction on fully ACID tables.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)