[jira] [Commented] (KUDU-2153) Servers delete tmp files before obtaining directory lock

2018-03-12 Thread Todd Lipcon (JIRA)

[ https://issues.apache.org/jira/browse/KUDU-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396368#comment-16396368 ]

Todd Lipcon commented on KUDU-2153:
---

I took a simple approach with a patch at https://gerrit.cloudera.org/c/9596/. 
As you said, the "wrong" files/directories are locked, but we can simply 
change the ordering at startup to circumvent the issue.
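
For illustration, the reordering could look roughly like the sketch below. It
is not the Gerrit patch: DirLock, OpenRoot, DeleteTmpFiles as written here, the
".lock" file name and the ".tmp" suffix are all hypothetical, and POSIX flock()
stands in for whatever locking Kudu actually uses. The only point is that the
exclusive lock is taken before anything is deleted, so a second server started
against the same directory fails fast and leaves the files alone.

```cpp
// Hypothetical sketch; none of these names are taken from the Kudu codebase.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cassert>
#include <filesystem>
#include <memory>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Holds an exclusive advisory lock on <dir>/.lock for its lifetime.
class DirLock {
 public:
  explicit DirLock(const fs::path& dir) {
    fd_ = ::open((dir / ".lock").c_str(), O_RDWR | O_CREAT, 0644);
    // LOCK_NB: fail immediately instead of waiting for the other holder.
    if (fd_ >= 0 && ::flock(fd_, LOCK_EX | LOCK_NB) != 0) {
      ::close(fd_);
      fd_ = -1;
    }
  }
  ~DirLock() {
    if (fd_ >= 0) ::close(fd_);  // closing the descriptor releases the lock
  }
  DirLock(const DirLock&) = delete;
  DirLock& operator=(const DirLock&) = delete;
  bool held() const { return fd_ >= 0; }

 private:
  int fd_ = -1;
};

// Removes regular files with a ".tmp" extension under dir (assumed suffix).
// Returns the number of files removed.
int DeleteTmpFiles(const fs::path& dir) {
  std::vector<fs::path> victims;
  std::error_code ec;
  // Collect first, delete afterwards, so the walk never sees its own removals.
  for (const auto& entry : fs::recursive_directory_iterator(dir, ec)) {
    if (entry.is_regular_file() && entry.path().extension() == ".tmp") {
      victims.push_back(entry.path());
    }
  }
  int removed = 0;
  for (const auto& p : victims) {
    if (fs::remove(p, ec)) ++removed;
  }
  return removed;
}

// Lock first, clean second. Returns nullptr, and deletes nothing, if some
// other holder already owns the lock on this root.
std::unique_ptr<DirLock> OpenRoot(const fs::path& root, int* removed) {
  auto lock = std::make_unique<DirLock>(root);
  if (!lock->held()) return nullptr;
  *removed = DeleteTmpFiles(root);
  return lock;
}
```

A caller would keep the returned lock alive for the life of the process; a null
return means another process owns the directory and startup should abort.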

> Servers delete tmp files before obtaining directory lock
>
>                 Key: KUDU-2153
>                 URL: https://issues.apache.org/jira/browse/KUDU-2153
>             Project: Kudu
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.2.0, 1.3.1, 1.4.0, 1.5.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>
> In FsManager::Open() we currently call DeleteTmpFiles very early, before 
> starting the block manager. This means that, if you accidentally start a 
> tserver while another is running, it's possible for it to delete temporary 
> files that are in use by the running tserver, causing it to exhibit strange 
> behavior, crash, etc. (as in KUDU-2152).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2153) Servers delete tmp files before obtaining directory lock

2017-09-20 Thread Adar Dembo (JIRA)

[ https://issues.apache.org/jira/browse/KUDU-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174053#comment-16174053 ]

Adar Dembo commented on KUDU-2153:
--

That's unfortunate, though the block manager's locks only protect the data 
directories, and FsManager::DeleteTmpFiles doesn't actually walk the data 
directories.

So I'm not disagreeing with the premise of the bug, just pointing out that data 
directory locking won't help here; we'd need to introduce new locking on the 
other "special" directories (i.e. wals/, cmeta/, and tablet_meta/). Or change 
the block manager locking to lock the root instances rather than the data 
directory instances.
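
To make the first option concrete, here is a rough sketch of taking a lock in
each of the "special" directories as a single all-or-nothing step. Everything
in it is an assumption for illustration: LockAllDirs, UnlockAll and the ".lock"
file name are hypothetical, and POSIX flock() is used in place of Kudu's real
locking.

```cpp
// Hypothetical sketch; none of these names are taken from the Kudu codebase.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cassert>
#include <filesystem>
#include <optional>
#include <vector>

namespace fs = std::filesystem;

// Tries to take an exclusive, non-blocking flock() on a ".lock" file in every
// directory. On any failure it releases what it already took and returns
// std::nullopt, so the caller never ends up holding a partial set.
std::optional<std::vector<int>> LockAllDirs(const std::vector<fs::path>& dirs) {
  std::vector<int> fds;
  for (const auto& dir : dirs) {
    int fd = ::open((dir / ".lock").c_str(), O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ::flock(fd, LOCK_EX | LOCK_NB) != 0) {
      if (fd >= 0) ::close(fd);
      for (int held : fds) ::close(held);  // closing releases each lock
      return std::nullopt;
    }
    fds.push_back(fd);
  }
  return fds;
}

// Releases every lock taken by LockAllDirs.
void UnlockAll(std::vector<int>& fds) {
  for (int fd : fds) ::close(fd);
  fds.clear();
}
```

A server would call this with wals/, cmeta/ and tablet_meta/ before touching
any of them, and refuse to start if the result is empty.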




