[ 
https://issues.apache.org/jira/browse/CHUKWA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854357#action_12854357
 ] 

Ahmed Fathalla commented on CHUKWA-4:
-------------------------------------

I made the changes Jerome recommended and tested them; it seems to be working 
correctly. Please take a look and let me know if you have any comments.

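import org.apache.hadoop.chukwa.ChukwaArchiveKey;
import org.apache.hadoop.chukwa.ChunkImpl;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.log4j.Logger;

public class CopySequenceFile {
  // Log under this class rather than LocalWriter
  static Logger log = Logger.getLogger(CopySequenceFile.class);
  private static SequenceFile.Writer seqFileWriter = null;
  private static SequenceFile.Reader seqFileReader = null;
  private static FSDataOutputStream newOutputStr = null;

  public static void main(String args[]) {
  }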
        
  public static void createValidSequenceFile(Configuration conf,
      String originalFileDir, String originalFileName, FileSystem localFs) {
    try {
      // Build the paths for the original file and each recovery stage
      String originalCompleteDir = originalFileDir + originalFileName;
      Path originalPath = new Path(originalCompleteDir);
      int extensionIndex = originalFileName.indexOf(".chukwa", 0);
      String recoverDoneFileName =
          originalFileName.substring(0, extensionIndex) + ".recoverDone";
      String recoverDoneDir = originalFileDir + recoverDoneFileName;
      Path recoverDonePath = new Path(recoverDoneDir);
      String recoverFileName =
          originalFileName.substring(0, extensionIndex) + ".recover";
      String recoverDir = originalFileDir + recoverFileName;
      Path recoverPath = new Path(recoverDir);
      String doneFileName =
          originalFileName.substring(0, extensionIndex) + ".done";
      String doneDir = originalFileDir + doneFileName;
      Path donePath = new Path(doneDir);
          
      newOutputStr = localFs.create(recoverPath);
      seqFileWriter = SequenceFile.createWriter(conf, newOutputStr,
          ChukwaArchiveKey.class, ChunkImpl.class,
          SequenceFile.CompressionType.NONE, null);
      seqFileReader = new SequenceFile.Reader(localFs, originalPath, conf);

      System.out.println("key class name is "
          + seqFileReader.getKeyClassName());
      System.out.println("value class name is "
          + seqFileReader.getValueClassName());
      ChukwaArchiveKey key = new ChukwaArchiveKey();
      ChunkImpl evt = ChunkImpl.getBlankChunk();
      try {
        // Copy every readable chunk into the .recover file
        while (seqFileReader.next(key, evt)) {
          seqFileWriter.append(key, evt);
        }
      } catch (ChecksumException e) {
        // Thrown when we read a bad chunk while copying
        log.warn("Encountered Bad Chunk while copying .chukwa file, "
            + "continuing", e);
      }
      // Close the reader and writer before renaming, so that all copied
      // data is flushed to the .recover file first
      seqFileReader.close();
      seqFileWriter.close();
      newOutputStr.close();
      try {
        // Rename the destination file from .recover to .recoverDone
        localFs.rename(recoverPath, recoverDonePath);
        // Delete the original .chukwa file
        localFs.delete(originalPath, false);
        // Rename .recoverDone to .done
        localFs.rename(recoverDonePath, donePath);
      } catch (Exception e) {
        // log.warn already records the stack trace
        log.warn("Error occurred while renaming .recover to .recoverDone, "
            + "deleting .chukwa, or renaming .recoverDone to .done", e);
      }
    } catch (Exception e) {
      log.warn("Error during .chukwa file recovery", e);
    }
  }
}
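
To make the naming scheme and the rename order easier to review, here is a 
minimal sketch of the same idea using only java.nio instead of Hadoop's 
FileSystem. The class RecoveryNamesSketch and its methods withExtension and 
promote are hypothetical names made up for illustration; they are not part of 
the patch or of Chukwa. It assumes the .recover copy has already been written 
and closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Chukwa: mirrors the file naming and the
// rename order of createValidSequenceFile with plain java.nio calls.
final class RecoveryNamesSketch {

  // Replace the ".chukwa" part of the name with another extension.
  // Like the patch, this cuts at the first occurrence of ".chukwa".
  static String withExtension(String chukwaFileName, String newExtension) {
    int extensionIndex = chukwaFileName.indexOf(".chukwa");
    if (extensionIndex < 0) {
      throw new IllegalArgumentException(
          "not a .chukwa file: " + chukwaFileName);
    }
    return chukwaFileName.substring(0, extensionIndex) + newExtension;
  }

  // Promote a fully written .recover copy to .done in three steps.
  // Assumption: the .recover file exists and has already been closed.
  static Path promote(Path dir, String chukwaFileName) throws IOException {
    Path original = dir.resolve(chukwaFileName);
    Path recover = dir.resolve(withExtension(chukwaFileName, ".recover"));
    Path recoverDone =
        dir.resolve(withExtension(chukwaFileName, ".recoverDone"));
    Path done = dir.resolve(withExtension(chukwaFileName, ".done"));

    Files.move(recover, recoverDone);  // the copy is complete
    Files.deleteIfExists(original);    // the original is no longer needed
    Files.move(recoverDone, done);     // publish the recovered file
    return done;
  }

  public static void main(String[] args) throws IOException {
    // Simulate a sink directory holding an original and its recovered copy
    Path dir = Files.createTempDirectory("sink");
    Files.write(dir.resolve("a.chukwa"), new byte[] {1, 2, 3});
    Files.write(dir.resolve("a.recover"), new byte[] {1, 2});
    System.out.println(promote(dir, "a.chukwa").getFileName());
  }
}
```

The intermediate .recoverDone name marks a copy that is complete but not yet 
published, so a process that is interrupted between the steps leaves a state 
that can be told apart from a half-written .recover file.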


> Collectors don't finish writing .done datasink from last .chukwa datasink 
> when stopped using bin/stop-collectors
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CHUKWA-4
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-4
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>         Environment: I am running on our local cluster. This is a linux 
> machine that I also run Hadoop cluster from.
>            Reporter: Andy Konwinski
>            Priority: Minor
>
> When I use start-collectors, it creates the datasink as expected, writes to 
> it as per normal, i.e. writes to the .chukwa file, and roll overs work fine 
> when it renames the .chukwa file to .done. However, when I use 
> bin/stop-collectors to shut down the running collector it leaves a .chukwa 
> file in the HDFS file system. Not sure if this is a valid sink or not, but I 
> think that the collector should gracefully clean up the datasink and rename 
> it .done before exiting.
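
For reference, the graceful cleanup asked for in the description could look 
roughly like the sketch below. This is not Chukwa's collector code: the class 
SinkShutdownSketch and its methods are hypothetical, it uses java.nio rather 
than HDFS, and it assumes the stop script ends the JVM in a way that lets 
shutdown hooks run (a hook does not run if the process is killed outright).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Chukwa's collector code.
final class SinkShutdownSketch {

  // Close the open sink stream, then rename foo.chukwa to foo.done.
  static Path closeAndMarkDone(OutputStream sinkStream, Path chukwaFile)
      throws IOException {
    sinkStream.close();
    String name = chukwaFile.getFileName().toString();
    if (!name.endsWith(".chukwa")) {
      throw new IllegalArgumentException("not a .chukwa file: " + name);
    }
    String doneName =
        name.substring(0, name.length() - ".chukwa".length()) + ".done";
    // Files.move returns the path of the renamed file
    return Files.move(chukwaFile, chukwaFile.resolveSibling(doneName));
  }

  // Register a JVM shutdown hook that finishes the sink on a normal stop.
  static Thread installHook(OutputStream sinkStream, Path chukwaFile) {
    Thread hook = new Thread(() -> {
      try {
        closeAndMarkDone(sinkStream, chukwaFile);
      } catch (IOException e) {
        System.err.println("Could not finish sink: " + e);
      }
    });
    Runtime.getRuntime().addShutdownHook(hook);
    return hook;
  }
}
```

A collector would call installHook once after opening its sink file. A 
recovery pass such as the one in the comment above is still needed for sinks 
left behind by a crash.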

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
