Juhani, 

Please try the attached patch and let me know if it works. You were right: we 
did not update the checkpoint rebuilder when we changed the data file format. 
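
To illustrate the kind of failure involved, here is a minimal sketch of a version check that rejects a file format the tool was never taught about. This is not the actual Flume code: the class name, the `checkVersion` helper, the version constants, and the assumption that the version is a 4-byte big-endian integer at the start of the file are all hypothetical, chosen only to show the pattern.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Collection;

public class VersionCheckSketch {
    // Hypothetical version numbers, used only for illustration.
    static final int VERSION_2 = 2;
    static final int VERSION_3 = 3;

    // Reads a 4-byte big-endian version from the header (assumed layout)
    // and rejects it if the caller does not list it as supported.
    static int checkVersion(String fileName, byte[] header,
                            Collection<Integer> supported) {
        int version = ByteBuffer.wrap(header).getInt();
        if (!supported.contains(version)) {
            throw new IllegalStateException("File " + fileName
                + " has bad version " + Integer.toHexString(version));
        }
        return version;
    }

    public static void main(String[] args) {
        byte[] v3Header = {0, 0, 0, 3};

        // A tool updated for the new format accepts the file.
        System.out.println(
            checkVersion("log-1", v3Header, Arrays.asList(VERSION_2, VERSION_3)));

        // A tool that only knows the old format rejects the same file.
        try {
            checkVersion("log-1", v3Header, Arrays.asList(VERSION_2));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that a standalone tool with its own list of supported versions must be updated together with the format, otherwise it fails on files the main code path reads without trouble.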


Thanks,
Hari

-- 
Hari Shreedharan


On Tuesday, February 26, 2013 at 11:28 PM, Hari Shreedharan wrote:

> Juhani, 
> 
> I think I know what the issue is. I will send a patch in a few minutes. You 
> can give it a try and let me know if it works.
> 
> 
> Thanks,
> Hari
> -- 
> Hari Shreedharan
> 
> 
> On Tuesday, February 26, 2013 at 10:42 PM, Juhani Connolly wrote:
> 
> > I ran this on logs whose checkpoint was corrupted, but when I then 
> > started Flume it kicked off the checkpoint rebuild process again.
> > 
> > So it doesn't really look like it is working. I suspect that it is not 
> > capable of recognizing v3 logs but I may be wrong. I'd like to bring 
> > this up to date, but would appreciate it if someone could give me a 
> > heads up on roughly what it would involve, and which code to have a look at.
> > 
> > On 02/26/2013 06:08 PM, Juhani Connolly wrote:
> > > Hi Hari,
> > > 
> > > Here are the logs, followed by the patch. I don't think the 
> > > mailing list accepts attachments, so I'm pasting it in after the 
> > > logs (note that I changed the parameter for the checkpoint dir so it 
> > > didn't overlap with conf).
> > > 
> > > # first run
> > > $ sudo su cy_flume -c "JAVA_HOME=/usr/local/java ./flume-ng cp-rebuild 
> > > -c /etc/flume/conf -h /tmp/flume-check -l /tmp/flume-data/ -t 5000000"
> > > Info: Sourcing environment configuration script 
> > > /etc/flume/conf/flume-env.sh
> > > + exec /usr/local/java/bin/java -server 
> > > -XX:OnOutOfMemoryError=/tmp/stop.sh -XX:MaxPermSize=24m 
> > > -XX:PermSize=24m -XX:SurvivorRatio=8 -Xmn96m -Xmx512m -Xms128m 
> > > -Dcom.sun.management.jmxremote 
> > > -Dcom.sun.management.jmxremote.port=12345 
> > > -Dcom.sun.management.jmxremote.ssl=false 
> > > -Dcom.sun.management.jmxremote.authenticate=false 
> > > -Djava.rmi.server.hostname=172.28.202.76 
> > > -Dflume.monitoring.type=GANGLIA 
> > > -Dflume.monitoring.hosts=pat-log-om01:8649 -cp 
> > > '/etc/flume/conf:/usr/lib/flume/lib/*' -Djava.library.path= 
> > > org.apache.flume.channel.file.CheckpointRebuilder -h /tmp/flume-check 
> > > -l /tmp/flume-data/ -t 5000000
> > > Exception in thread "main" java.io.IOException: File 
> > > /tmp/flume-data/log-1.meta has bad version 1c0d0300
> > > at 
> > > org.apache.flume.channel.file.LogFileFactory.getSequentialReader(LogFileFactory.java:169)
> > > at 
> > > org.apache.flume.channel.file.CheckpointRebuilder.rebuild(CheckpointRebuilder.java:68)
> > > at 
> > > org.apache.flume.channel.file.CheckpointRebuilder.main(CheckpointRebuilder.java:257)
> > > 
> > > # second run
> > > 
> > > $ sudo su cy_flume -c "JAVA_HOME=/usr/local/java ./flume-ng cp-rebuild 
> > > -c /etc/flume/conf -h /tmp/flume-check -l /tmp/flume-data/ -t 5000000"
> > > Info: Sourcing environment configuration script 
> > > /etc/flume/conf/flume-env.sh
> > > + exec /usr/local/java/bin/java -server 
> > > -XX:OnOutOfMemoryError=/tmp/stop.sh -XX:MaxPermSize=24m 
> > > -XX:PermSize=24m -XX:SurvivorRatio=8 -Xmn96m -Xmx512m -Xms128m 
> > > -Dcom.sun.management.jmxremote 
> > > -Dcom.sun.management.jmxremote.port=12345 
> > > -Dcom.sun.management.jmxremote.ssl=false 
> > > -Dcom.sun.management.jmxremote.authenticate=false 
> > > -Djava.rmi.server.hostname=172.28.202.76 
> > > -Dflume.monitoring.type=GANGLIA 
> > > -Dflume.monitoring.hosts=pat-log-om01:8649 -cp 
> > > '/etc/flume/conf:/usr/lib/flume/lib/*' -Djava.library.path= 
> > > org.apache.flume.channel.file.CheckpointRebuilder -h /tmp/flume-check 
> > > -l /tmp/flume-data/ -t 5000000
> > > $
> > > 
> > > 
> > > diff from here:
> > > 
> > > diff --git a/bin/flume-ng b/bin/flume-ng
> > > index ee86c95..b7174b6 100755
> > > --- a/bin/flume-ng
> > > +++ b/bin/flume-ng
> > > @@ -26,6 +26,7 @@
> > > FLUME_AGENT_CLASS="org.apache.flume.node.Application"
> > > FLUME_AVRO_CLIENT_CLASS="org.apache.flume.client.avro.AvroCLIClient"
> > > FLUME_VERSION_CLASS="org.apache.flume.tools.VersionInfo"
> > > +FLUME_CHECKPOINT_REBUILDER_CLASS="org.apache.flume.channel.file.CheckpointRebuilder"
> > >  
> > > 
> > > 
> > > CLEAN_FLAG=1
> > > ################################
> > > @@ -198,6 +199,11 @@ avro-client options:
> > > --headerFile,-R <file> File containing event headers as key/value 
> > > pairs on each new line
> > > --help,-h display help text
> > > 
> > > +cp-rebuild options:
> > > + -h <dir> Checkpoint directory
> > > + -l <dir>[,<dir>]* Comma-separated list of log directories
> > > + -t <cap> capacity of the channel
> > > +
> > > Either --rpcProps or both --host and --port must be specified.
> > > 
> > > Note that if <conf> directory is specified, then it is always 
> > > included first
> > > @@ -260,6 +266,9 @@ case "$mode" in
> > > opt_version=1
> > > CLEAN_FLAG=0
> > > ;;
> > > + cp-rebuild)
> > > + opt_cp_rebuild=1
> > > + ;;
> > > *)
> > > error "Unknown or unspecified command '$mode'"
> > > echo
> > > @@ -417,6 +426,8 @@ elif [ -n "$opt_avro_client" ] ; then
> > > run_flume $FLUME_AVRO_CLIENT_CLASS $args
> > > elif [ -n "${opt_version}" ] ; then
> > > run_flume $FLUME_VERSION_CLASS $args
> > > +elif [ -n "${opt_cp_rebuild}" ] ; then
> > > + run_flume $FLUME_CHECKPOINT_REBUILDER_CLASS $args
> > > else
> > > error "This message should never appear" 1
> > > fi
> > > diff --git a/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java b/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java
> > > index 6e64003..55357fa 100644
> > > --- a/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java
> > > +++ b/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java
> > > @@ -222,7 +222,7 @@ public class CheckpointRebuilder {
> > > 
> > > public static void main(String[] args) throws Exception {
> > > Options options = new Options();
> > > - Option opt = new Option("c", true, "checkpoint directory");
> > > + Option opt = new Option("h", true, "checkpoint directory");
> > > opt.setRequired(true);
> > > options.addOption(opt);
> > > opt = new Option("l", true, "comma-separated list of log 
> > > directories");
> > > @@ -234,7 +234,7 @@ public class CheckpointRebuilder {
> > > options.addOption(opt);
> > > CommandLineParser parser = new GnuParser();
> > > CommandLine cli = parser.parse(options, args);
> > > - File checkpointDir = new File(cli.getOptionValue("c"));
> > > + File checkpointDir = new File(cli.getOptionValue("h"));
> > > String[] logDirs = cli.getOptionValue("l").split(",");
> > > List<File> logFiles = Lists.newArrayList();
> > > for (String logDir : logDirs) {
> > > 
> > > 
> > > On 02/26/2013 05:27 PM, Hari Shreedharan wrote:
> > > > Hi Juhani,
> > > > 
> > > > Could you please send the full logs? I don't remember if the main 
> > > > method was updated for the new file channel format and for the 
> > > > encryption patch. The CheckpointRebuilder works fine if you call it 
> > > > from the channel startup (using the use-fast-replay parameter).
> > > > 
> > > > 
> > > > Thanks,
> > > > Hari
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> > 
> 
> 
