[ 
https://issues.apache.org/jira/browse/HADOOP-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534985
 ] 

Konstantin Shvachko commented on HADOOP-2044:
---------------------------------------------

The patch fixes 2 important bugs:
# Access to and modification of FSNamesystem.sortedLeases should be synchronized 
(under a lock that is different from the global FSNamesystem lock).
The bug was that startFileInternal() modified sortedLeases while holding the global 
lock but not the leases lock.
# completeFile() and getAdditionalBlock() cannot assume that they always deal 
with a file under construction: if the client fails to remove the lease, 
the file is automatically converted into a regular file, and modifications of 
such files are not allowed.
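
To illustrate the second point, here is a minimal sketch of the kind of check that avoids the ClassCastException. It is not the code from the patch: the class CompleteFileSketch, the method checkUnderConstruction() and the fields path and clientName are hypothetical, and the INode classes are bare stand-ins that only mirror the names used in this issue.

```java
import java.io.IOException;

// Hypothetical, self-contained sketch; not the actual FSNamesystem code.
class CompleteFileSketch {

    // Stand-in for a regular (closed) file.
    static class INodeFile {
        final String path;
        INodeFile(String path) { this.path = path; }
    }

    // Stand-in for a file that is still being written by a client.
    static class INodeFileUnderConstruction extends INodeFile {
        final String clientName;
        INodeFileUnderConstruction(String path, String clientName) {
            super(path);
            this.clientName = clientName;
        }
    }

    // Verifies the type before casting, so a file that has already been
    // converted into a regular file is reported as an IOException rather
    // than surfacing as a ClassCastException.
    static INodeFileUnderConstruction checkUnderConstruction(INodeFile node)
            throws IOException {
        if (node == null) {
            throw new IOException("file does not exist");
        }
        if (!(node instanceof INodeFileUnderConstruction)) {
            throw new IOException(node.path + " is not under construction");
        }
        return (INodeFileUnderConstruction) node;
    }
}
```

With a check like this, a caller such as completeFile() would fail the request cleanly when the lease has already expired, instead of assuming the cast always succeeds.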

I have 2 comments, which do not address the correctness of the patch but are rather 
intended to make the code more understandable.
# Instead of using the "instanceof INodeFileUnderConstruction" operator it is 
better to define a virtual method INode.isFileUnderConstruction().
# All modifications of the FSNamesystem.leases member are performed under the 
global FSNamesystem lock, 
while modifications of FSNamesystem.sortedLeases are done under the lock 
associated with FSNamesystem.leases, which
is correct but unintuitive. I'd propose to replace all 
{code}synchronized (leases) {}
{code}
sections by
{code}synchronized (sortedLeases) {}
{code}
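
A rough sketch of both suggestions follows, assuming simplified stand-in classes. Only the names INode, INodeFile, INodeFileUnderConstruction, isFileUnderConstruction() and sortedLeases come from this issue; LeaseLockSketch, LeaseTable and its methods are hypothetical, and leases are reduced to holder strings for brevity.

```java
import java.util.TreeSet;

// Hypothetical, self-contained sketch; not the actual FSNamesystem code.
class LeaseLockSketch {

    // Suggestion 1: a virtual method replaces instanceof checks at call sites.
    static class INode {
        boolean isFileUnderConstruction() { return false; }
    }

    static class INodeFile extends INode { }

    static class INodeFileUnderConstruction extends INodeFile {
        @Override
        boolean isFileUnderConstruction() { return true; }
    }

    // Suggestion 2: sortedLeases is guarded by its own monitor, so the lock
    // object and the data it protects are visibly the same thing.
    static class LeaseTable {
        private final TreeSet<String> sortedLeases = new TreeSet<String>();

        void addLease(String holder) {
            synchronized (sortedLeases) {
                sortedLeases.add(holder);
            }
        }

        boolean removeLease(String holder) {
            synchronized (sortedLeases) {
                return sortedLeases.remove(holder);
            }
        }

        int size() {
            synchronized (sortedLeases) {
                return sortedLeases.size();
            }
        }
    }
}
```

The behavior is the same as locking on another object, provided every access uses the same monitor; the benefit is only that a reader can see at a glance which lock protects sortedLeases.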

> Namenode encounters ClassCastException exceptions for 
> INodeFileUnderConstruction
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-2044
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2044
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: InodeClassException.patch
>
>
> A distcp command running on one 400 node cluster shows this exception:
> org.apache.hadoop.fs.FSNamesystem: Removing lease [Lease.  Holder: 44 46 53 
> 43 6c 69 65 6e 74 5f 74 61 73 6b 5f 32 30 30 37 31 30 31 31 32 32 35 37 5f 30 
> 30 30 33 5f 6d 5f 30 30 30 30 39 32 5f 30, heldlocks: 0, pendingcreates: 0], 
> leases remaining: 736
> org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: 
> attempt to release a create lock on 
> /user/xxxx/logs_21/_task_200710112257_0003_m_000027_0/part-00027 file does 
> not exist.
>  org.apache.hadoop.fs.FSNamesystem: Removing lease [Lease.  Holder: 44 46 53 
> 43 6c 69 65 6e 74 5f 74 61 73 6b 5f 32 30 30 37 31 30 31 31 32 32 35 37 5f 30 
> 30 30 33 5f 6d 5f 30 30 30 30 32 37 5f 30, heldlocks: 0, pendingcreates: 0], 
> leases remaining: 735
>  org.apache.hadoop.fs.FSNamesystem: java.lang.ClassCastException: 
> org.apache.hadoop.dfs.INodeFile cannot be cast to 
> org.apache.hadoop.dfs.INodeFileUnderConstruction
>         at 
> org.apache.hadoop.dfs.FSNamesystem.internalReleaseCreate(FSNamesystem.java:1566)
>         at org.apache.hadoop.dfs.FSNamesystem.access$100(FSNamesystem.java:51)
>         at 
> org.apache.hadoop.dfs.FSNamesystem$Lease.releaseLocks(FSNamesystem.java:1463)
>         at 
> org.apache.hadoop.dfs.FSNamesystem$LeaseMonitor.run(FSNamesystem.java:1525)
>         at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
