[jira] [Commented] (HIVE-22755) Cleaner/Compaction can skip the read locks and use the min open txn id

2020-01-22 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020942#comment-17020942
 ] 

Peter Vary commented on HIVE-22755:
---

In my deleted comment I misunderstood the purpose of the jira :(
The read lock is needed so that no concurrent DDL command runs in parallel 
with the compaction. Until table/partition metadata is versioned, removing 
this read lock could be problematic.

> Cleaner/Compaction can skip the read locks and use the min open txn id
> --
>
> Key: HIVE-22755
> URL: https://issues.apache.org/jira/browse/HIVE-22755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Slim Bouguerra
>Priority: Major
> Fix For: 4.0.0
>
>
> The minOpenTxnId is used by the Cleaner here
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java#L154
> This currently converts it to open write-ids to clean appropriately.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22755) Cleaner/Compaction can skip the read locks and use the min open txn id

2020-01-21 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020832#comment-17020832
 ] 

Peter Vary commented on HIVE-22755:
---

I have 2 ideas:
* Time-based approach: use the start time of the minOpenTxn to filter the files 
to clean.
* Instead of skipping every read lock, take one read lock per table rather than 
one per partition. This would fix some failures caused by the previous patch, 
with the added benefit of knowing which tables are being read.
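
The two ideas above could be sketched roughly as follows. This is only an illustration, not Hive code: the class CleanerIdeasSketch, the ObsoleteDir type and both method names are hypothetical, and the sketch assumes that a transaction which started after a directory became obsolete never needs to read that directory.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Hive.
class CleanerIdeasSketch {

    /** A directory superseded by a compaction (assumed model, for illustration only). */
    static final class ObsoleteDir {
        final String table;             // e.g. "db.t1"
        final String partition;         // e.g. "p=1"
        final long obsoleteSinceMillis; // assumed: when the superseding compaction committed

        ObsoleteDir(String table, String partition, long obsoleteSinceMillis) {
            this.table = table;
            this.partition = partition;
            this.obsoleteSinceMillis = obsoleteSinceMillis;
        }
    }

    /**
     * Idea 1 (time-based): keep only the directories that became obsolete
     * strictly before the oldest open transaction started.
     */
    static List<ObsoleteDir> safeToClean(List<ObsoleteDir> candidates,
                                         long minOpenTxnStartMillis) {
        List<ObsoleteDir> result = new ArrayList<>();
        for (ObsoleteDir dir : candidates) {
            if (dir.obsoleteSinceMillis < minOpenTxnStartMillis) {
                result.add(dir);
            }
        }
        return result;
    }

    /**
     * Idea 2 (coarser locks): collect one lock target per table instead of
     * one per partition, preserving first-seen order.
     */
    static Set<String> tableLockTargets(List<ObsoleteDir> candidates) {
        Set<String> tables = new LinkedHashSet<>();
        for (ObsoleteDir dir : candidates) {
            tables.add(dir.table);
        }
        return tables;
    }
}
```

With three candidate directories spread over two tables, tableLockTargets yields two lock targets rather than three, which is the reduction the second idea is after.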

CC: [~lpinter], [~dkuzmenko]






[jira] [Commented] (HIVE-22755) Cleaner/Compaction can skip the read locks and use the min open txn id

2020-01-21 Thread Slim Bouguerra (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020594#comment-17020594
 ] 

Slim Bouguerra commented on HIVE-22755:
---

CC [~t3rmin4t0r]: please feel free to add more insight into your idea of how 
the Cleaner can skip the read lock.

> Cleaner/Compaction can skip the read locks and use the min open txn id
> --
>
> Key: HIVE-22755
> URL: https://issues.apache.org/jira/browse/HIVE-22755
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Slim Bouguerra
>Priority: Major
>



