[
https://issues.apache.org/jira/browse/HIVE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758063#comment-16758063
]
Seung-Hyun Cheong edited comment on HIVE-21194 at 2/13/19 8:29 AM:
-------------------------------------------------------------------
[~bslim]
I'm using HDP 3.1.1 (Hive 3.1.0, Druid 0.12.1).
This is the query I use to insert data from HDFS into Druid:
{code:java}
INSERT
INTO
TABLE druid.data_table
SELECT
`time` AS `__time`,
.
.
.
FROM
hdfs.data_table
WHERE
.
.
.{code}
The metadata of the inserted segment:
!image-2019-02-01-16-31-56-958.png!
* The interval of the segment: UTC (the green one)
* The version of the segment: UTC+9 (the red one; it's KST)
To delete one of the segments I inserted, I send these requests:
{code:java}
// Disabling the segment
DELETE /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}
// Deleting the segment
DELETE /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}{code}
Then the following exception occurs:
{code:java}
2019-01-30T16:58:35,354 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KillTask{id=kill_upload_2018-12-31T00:00:00.000Z_2019-02-05T00:00:00.000Z_2019-02-01T16:52:31.851Z, type=kill, dataSource=upload}]
io.druid.java.util.common.ISE: WTF?! Unused segment[upload_2019-01-01T00:00:00.000Z_2019-01-02T00:00:00.000Z_2019-01-31T01:12:32.289+09:00] has version[2019-01-31T01:12:32.289+09:00] > task version[2019-01-30T16:58:29.992Z]
    at io.druid.indexing.common.task.KillTask.run(KillTask.java:94) ~[druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
{code}
So in KST (UTC+9), I can't delete a segment until 9 hours after it was inserted.
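To illustrate why the check fails, here is a minimal sketch. It is not Druid code: the class and method names are made up for this example, and java.time stands in for Joda-Time so that it runs without extra dependencies. It compares the two version strings from the log above first as plain strings, which is how the KillTask check compares them, and then as points in time.

```java
import java.time.OffsetDateTime;

// Hypothetical demo class; not part of Druid or Hive.
class SegmentVersionOrdering {

    // Plain String comparison, the same kind of check that the KillTask applies.
    static boolean lexicallyNewer(String segmentVersion, String taskVersion) {
        return segmentVersion.compareTo(taskVersion) > 0;
    }

    // Compares the same two versions as instants, ignoring how they are rendered.
    static boolean chronologicallyNewer(String segmentVersion, String taskVersion) {
        return OffsetDateTime.parse(segmentVersion).toInstant()
                .isAfter(OffsetDateTime.parse(taskVersion).toInstant());
    }

    public static void main(String[] args) {
        String segmentVersion = "2019-01-31T01:12:32.289+09:00"; // segment version from the log (KST)
        String taskVersion = "2019-01-30T16:58:29.992Z";         // task version from the log (UTC)

        // The KST string sorts after the UTC string because "31" > "30" ...
        System.out.println("lexically newer: " + lexicallyNewer(segmentVersion, taskVersion));
        // ... although the segment was actually created about 46 minutes earlier.
        System.out.println("chronologically newer: " + chronologicallyNewer(segmentVersion, taskVersion));
    }
}
```

The string comparison reports the segment as newer than the task, while the instant comparison does not. That mismatch lasts until the UTC wall-clock string catches up with the KST one, i.e. for 9 hours.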
By contrast, a segment published by Druid's IndexTask has a UTC version (below):
!image-2019-02-01-16-32-17-093.png!
* The interval of the segment: UTC
* The version of the segment: UTC
So the problem does not occur there.
Also, I hadn't actually tested my patch, sorry about that. I have deleted it for now and will submit a new one after testing.
(QA failed because my patch was missing the import statement "import org.joda.time.DateTimeZone;".)
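A rough sketch of the idea behind the fix, under stated assumptions: this is not the patch itself, the class and method names are hypothetical, and java.time is used in place of Joda-Time. In Joda-Time the corresponding change would presumably be something like new DateTime(DateTimeZone.UTC).toString(). The sketch renders the same creation instant once at UTC+9 and once at UTC, and compares each string with the task version from the log.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical sketch; not the actual Hive patch.
class UtcSegmentVersion {

    // Formats an instant as an ISO-8601 string at the given offset.
    static String versionAt(Instant instant, ZoneOffset offset) {
        return OffsetDateTime.ofInstant(instant, offset).toString();
    }

    public static void main(String[] args) {
        // The creation instant of the segment from the log, expressed in UTC.
        Instant created = Instant.parse("2019-01-30T16:12:32.289Z");
        String taskVersion = "2019-01-30T16:58:29.992Z";

        String kstVersion = versionAt(created, ZoneOffset.ofHours(9)); // what a KST default zone produces
        String utcVersion = versionAt(created, ZoneOffset.UTC);        // what the suggestion asks for

        // A positive comparison is the condition under which the KillTask throws.
        System.out.println(kstVersion + " > task version: " + (kstVersion.compareTo(taskVersion) > 0));
        System.out.println(utcVersion + " > task version: " + (utcVersion.compareTo(taskVersion) > 0));
    }
}
```

With the UTC rendering, the version string sorts before the task version, so the string comparison agrees with the real order of events.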
> DruidStorageHandler should set a version of segment to UTC
> ----------------------------------------------------------
>
> Key: HIVE-21194
> URL: https://issues.apache.org/jira/browse/HIVE-21194
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Affects Versions: 3.1.0
> Reporter: Seung-Hyun Cheong
> Assignee: Seung-Hyun Cheong
> Priority: Minor
> Attachments: image-2019-02-01-16-31-56-958.png,
> image-2019-02-01-16-32-17-093.png
>
>
> h1. Exception while running a KillTask
> {code:java}
> 2019-01-30T16:58:35,354 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KillTask{id=kill_upload_2018-12-31T00:00:00.000Z_2019-02-05T00:00:00.000Z_2019-02-01T16:52:31.851Z, type=kill, dataSource=upload}]
> io.druid.java.util.common.ISE: WTF?! Unused segment[upload_2019-01-01T00:00:00.000Z_2019-01-02T00:00:00.000Z_2019-01-31T01:12:32.289+09:00] has version[2019-01-31T01:12:32.289+09:00] > task version[2019-01-30T16:58:29.992Z]
>     at io.druid.indexing.common.task.KillTask.run(KillTask.java:94) ~[druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
>     at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
>     at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
>     at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> {code}
>
> h1. Reason
> h3. KillTask compares versions
> [KillTask.java#L88|https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillTask.java#L88]
> {code:java}
> if (unusedSegment.getVersion().compareTo(myLock.getVersion()) > 0) {
>   throw new ISE(
>       "WTF?! Unused segment[%s] has version[%s] > task version[%s]",
>       unusedSegment.getId(),
>       unusedSegment.getVersion(),
>       myLock.getVersion()
>   );
> }
> {code}
>
> h3. KillTask version (UTC, e.g. "2019-01-30T16:58:29.992Z")
> [TaskLockbox.java#L593|https://github.com/apache/incubator-druid/blob/8eae26fd4e7572060d112864dd3d5f6a865b9c89/indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskLockbox.java#L593]
> {code:java}
> version = DateTimes.nowUtc().toString();
> {code}
>
> h3. Segment version (UTC+9, e.g. "2019-01-31T01:12:32.289+09:00")
> [DruidStorageHandler.java#L755|https://github.com/apache/hive/blob/master/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java#L755]
> {code:java}
> jobProperties.put(DruidConstants.DRUID_SEGMENT_VERSION, new DateTime().toString());
> {code}
>
>
> h1. Suggestion
> h3. Because Druid uses UTC only, DruidStorageHandler should set the segment version in UTC.
>
>
>