[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9516:
-
Fix Version/s: 2.8.0

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, HDFS-9516_3.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-17 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-9516:
--
Fix Version/s: (was: 2.9.0)
   2.7.3

Committed to branch-2.8 and branch-2.7.

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.3
>
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, HDFS-9516_3.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9516:
--
Target Version/s: 2.8.0, 2.7.3

[~shv], unfortunately this came in too late for 2.7.2. That said, I don’t see 
any reason why this shouldn’t be in 2.8.0 and 2.7.3. Setting the 
target-versions accordingly on JIRA.

If you agree, appreciate backport help to those branches (branch-2.8.0, 
branch-2.7).


> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Fix For: 2.9.0
>
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, HDFS-9516_3.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-15 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-9516:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk and branch-2. Thank you Plamen.

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Fix For: 2.9.0
>
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, HDFS-9516_3.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-14 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: HDFS-9516_3.patch

Attaching "_3" patch which contains an assert that [~shv] wished for.

The assert this time checks that the volume base path is the starting path of 
the block file in order to assure that the block file will not attempt to be 
moved across volumes when finalized.

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, HDFS-9516_3.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: HDFS-9516_1.patch

Attaching first patch (_1) with proposed fix. I've left the assert in place. 
All unit tests pass locally.

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-9516_1.patch, HDFS-9516_testFailures.patch, 
> Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: HDFS-9516_testFailures.patch

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: (was: HDFS-9516_testFailures.patch)

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: HDFS-9516_testFailures.patch

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Status: Patch Available  (was: In Progress)

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-10 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-9516:
---
Attachment: HDFS-9516_2.patch

Attaching second patch (_2) with same fix but the assert statement taken out.

Please note I have also removed the try block; please let me know if I should 
have left that back in. It does not seem it was needed anymore when I took a 
look though.

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-9516_1.patch, HDFS-9516_2.patch, 
> HDFS-9516_testFailures.patch, Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-07 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HDFS-9516:
--
Attachment: truncate.dn.log

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
> Attachments: Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-07 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HDFS-9516:
--
Attachment: Main.java

> truncate file fails with data dirs on multiple disks
> 
>
> Key: HDFS-9516
> URL: https://issues.apache.org/jira/browse/HDFS-9516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
> Attachments: Main.java, truncate.dn.log
>
>
> FileSystem.truncate returns false (no exception) but the file is never closed 
> and not writable after this.
> It seems to be because of copy on truncate which is used because the system 
> is in upgrade state. In this case a rename between devices is attempted.
> See attached log and repro code.
> Probably also affects truncate snapshotted file when copy on truncate is also 
> used.
> Possibly it affects not only truncate but any block recovery.
> I think the problem is in updateReplicaUnderRecovery
> {code}
> ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten(
> newBlockId, recoveryId, rur.getVolume(), 
> blockFile.getParentFile(),
> newlength);
> {code}
> blockFile is created with copyReplicaWithNewBlockIdAndGS which is allowed to 
> choose any volume so rur.getVolume() is not where the block is located.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)