[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-22 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053285#comment-13053285
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

In what scenario is it necessary to close the edit logs while saving the 
namespace?
With respect to the backup namenode, the edits.new stream has already been 
opened, and it is not present in the current directory.

Is there any scenario where editLog.close() has to be called?
Can we add an additional API that does not call editLog.close(), and use it 
in the checkpoint flow alone?
Please correct me if I am wrong.
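
A minimal sketch of the kind of API variant being asked about here, using a toy 
model rather than the real HDFS classes. ToyEditLog, ToyImage, 
saveCheckpointKeepingStreams() and writeImage() are hypothetical names invented 
for illustration; only the idea that saveNamespace closes the edit log first 
comes from this thread. Whether skipping the close is actually safe in HDFS is 
exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an edit log; it only tracks which streams are open. */
class ToyEditLog {
    private final List<String> streams = new ArrayList<>();

    void open(String name) { streams.add(name); }

    /** Closes every open stream. */
    void close() { streams.clear(); }

    int openStreamCount() { return streams.size(); }
}

/** Toy stand-in for an image class; all method bodies are assumptions. */
class ToyImage {
    final ToyEditLog editLog = new ToyEditLog();
    int imagesWritten = 0;

    /** Models the behaviour described in the thread: close the edit log, then save. */
    void saveNamespace() {
        editLog.close();
        writeImage();
    }

    /** Hypothetical checkpoint-only variant that leaves the streams open. */
    void saveCheckpointKeepingStreams() {
        writeImage();
    }

    /** Placeholder for writing the image to disk (omitted in this sketch). */
    private void writeImage() { imagesWritten++; }
}

class CheckpointApiSketch {
    public static void main(String[] args) {
        ToyImage image = new ToyImage();
        image.editLog.open("edits.new");
        image.saveCheckpointKeepingStreams();
        System.out.println("open streams after checkpoint variant: "
                + image.editLog.openStreamCount());
        image.saveNamespace();
        System.out.println("open streams after saveNamespace: "
                + image.editLog.openStreamCount());
    }
}
```

In this toy model the checkpoint-only path leaves the diverted stream open, so a 
journal transaction arriving during the checkpoint would still have somewhere 
to go.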

 When checkpointing by backup node occurs parallely when a file is being 
 closed by a client then Exception occurs saying no journal streams. 
 

 Key: HDFS-1989
 URL: https://issues.apache.org/jira/browse/HDFS-1989
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.23.0


 The backup namenode initiates the checkpointing process. 
 As part of checkpointing, based on the timestamp, it either downloads the 
 FSImage or uses the existing one.
 It then tries to save the FSImage.
 While doing so, it closes the editLog streams.
 If, in parallel, a client closes a file just after the checkpointing 
 process has closed the editLog streams, we get an exception saying
 java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File 
 system changes are not persistent. No journal streams.
 Here the saveNamespace API closes all the editLog streams, resulting in this 
 issue.
  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049710#comment-13049710
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

Hi Todd,
On the backup namenode side, during checkpointing:
{noformat}
bnImage.loadCheckpoint(sig);
sig.validateStorageInfo(bnImage);
bnImage.saveCheckpoint();
{noformat}

{noformat}
void saveCheckpoint() throws IOException {
  saveNamespace(false);
}
{noformat}

In saveNamespace:
{noformat}
void saveNamespace(boolean renewCheckpointTime) throws IOException {
  // try to restore all failed edit logs here
  assert editLog != null : "editLog must be initialized";
  storage.attemptRestoreRemovedStorage();

  editLog.close();
{noformat}


So the edit logs are closed in the checkpoint flow.

The problem arises when the client issues a close on a file after 
editLog.close() has executed.
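
The interleaving described above can be replayed step by step in a toy model. 
This is only a sketch under stated assumptions: ToyJournal and JournalRaceSketch 
are hypothetical classes, not the real FSEditLog or BackupImage, and the only 
behaviour modelled is that logging an edit fails when no stream is open.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy journal; models only the "no open streams" check. */
class ToyJournal {
    private final List<StringBuilder> streams = new ArrayList<>();

    void openStream() { streams.add(new StringBuilder()); }

    /** Closes every stream, as editLog.close() is reported to do in this thread. */
    void close() { streams.clear(); }

    /** Appends an operation to every open stream; fails if none is open. */
    void logEdit(String op) {
        if (streams.isEmpty()) {
            throw new IllegalStateException("No journal streams.");
        }
        for (StringBuilder stream : streams) {
            stream.append(op).append('\n');
        }
    }
}

class JournalRaceSketch {
    /** Replays the interleaving; returns the failure message, or null if none. */
    static String replay() {
        ToyJournal journal = new ToyJournal();
        journal.openStream();             // a stream is open before the checkpoint
        journal.logEdit("OP_ADD");        // succeeds while the stream is open
        journal.close();                  // checkpoint flow closes the edit log
        try {
            journal.logEdit("OP_CLOSE");  // file close arrives before any reopen
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("failure: " + replay());
    }
}
```

The replay is sequential on purpose: it pins down the one ordering that fails 
instead of relying on thread timing to reproduce it.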







[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-15 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049914#comment-13049914
 ] 

Todd Lipcon commented on HDFS-1989:
---

My point is that the client cannot issue a close() at this time, because the 
BN has diverted its logs from apply mode to spool mode, and clients don't 
talk directly to the BN.





[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-15 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049973#comment-13049973
 ] 

Konstantin Shvachko commented on HDFS-1989:
---

Seems like a valid bug.
The client does not talk directly to the BN, but the NN sends it the journal 
transaction (a close in this case). If that arrives after the BN has closed 
its edits but before it has reopened them, the exception can happen.
Todd's right that the transaction should go into the journal spool, but I 
suspect that {{edits.close()}} closes all streams, including the spool, and 
that could be the problem.





[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050207#comment-13050207
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

Hi,

Yes, editLog.close() is the problem, as it closes all the edit streams, 
including the diverted ones.







[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-06-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049380#comment-13049380
 ] 

Todd Lipcon commented on HDFS-1989:
---

saveNamespace should only happen while in safemode, and closing a file 
should not be allowed while in safemode. This may have been addressed by 
HDFS-988.

On the BN side, it shouldn't be taking a checkpoint while the edits application 
is in active mode -- it should be in spool mode where the edits aren't being 
applied.





[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038518#comment-13038518
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

Why does the saveNamespace API close all the editLog streams?

Ideally, once the checkpoint starts, the streams have already been diverted to 
edits.new, and then in saveCheckpoint we try to move the contents of the 
current directory to latestcheckpoint.tmp.

So in what scenario do we need the editLog.close() call, which results in 
closing all the edit streams?



[jira] [Commented] (HDFS-1989) When checkpointing by backup node occurs parallely when a file is being closed by a client then Exception occurs saying no journal streams.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038519#comment-13038519
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

of syncs: 21 SyncTimes(ms): 130 176 
2011-05-17 14:28:44,921 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to sync edit log.
java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File system changes are not persistent. No journal streams.
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logEdit(FSEditLog.java:1029)
    at org.apache.hadoop.hdfs.server.namenode.BackupImage.journal(BackupImage.java:247)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.journal(BackupNode.java:224)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1419)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1413)

    at org.apache.hadoop.ipc.Client.call(Client.java:1052)
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:250)
    at $Proxy8.journal(Unknown Source)
    at org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.send(EditLogBackupOutputStream.java:181)
    at org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.flushAndSync(EditLogBackupOutputStream.java:155)
    at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:84)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:515)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1797)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:896)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1419)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1413)

