[ 
https://issues.apache.org/jira/browse/FLUME-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588132#comment-13588132
 ] 

Juhani Connolly commented on FLUME-1929:
----------------------------------------

This appears to hang.

Steps followed:
- start up Flume, feed it some data, then kill -9 it to try to force an 
inconsistent checkpoint
- delete in-use.lock, checkpoint and checkpoint.meta
- run the checkpoint rebuilder; the final command through our script is (note 
that I patched -c to become -h)

+ exec /usr/local/java/bin/java -server -XX:OnOutOfMemoryError=/tmp/stop.sh 
-XX:MaxPermSize=24m -XX:PermSize=24m -XX:SurvivorRatio=8 -Xmn96m -Xmx512m 
-Xms128m -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.port=12345 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Djava.rmi.server.hostname=172.28.202.76 -Dflume.monitoring.type=GANGLIA 
-Dflume.monitoring.hosts=pat-log-om01:8649 -cp 
'/etc/flume/conf:/usr/lib/flume/lib/*' -Djava.library.path= 
org.apache.flume.channel.file.CheckpointRebuilder -h /tmp/flume-check -l 
/tmp/flume-data -t 5000000
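
For anyone repeating the second step, it could be scripted roughly as below. This is only a sketch: the helper name clear_checkpoint_state is hypothetical and not part of Flume, the three file names are the ones listed in the steps, and the directory to clean is whatever you pass in (an assumption here is that it would be the checkpoint directory, /tmp/flume-check in this run). Names that are not present are skipped.

```python
from pathlib import Path

# File names taken from the steps above; nothing else is touched.
STALE_FILES = ("in-use.lock", "checkpoint", "checkpoint.meta")


def clear_checkpoint_state(directory):
    """Delete the lock and checkpoint files if present.

    Hypothetical helper, not part of Flume. Returns the list of
    file names that were actually removed, in STALE_FILES order.
    """
    removed = []
    for name in STALE_FILES:
        path = Path(directory) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

Calling clear_checkpoint_state("/tmp/flume-check") with the agent stopped would leave the log-N data files alone and remove only the three named files.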



Full logs are as below:

27 Feb 2013 17:51:35,995 INFO  [main] 
(org.apache.flume.channel.file.EventQueueBackingStoreFile.<init>:71)  - 
Preallocated /tmp/flume-check/checkpoint to 40008232 for capacity 5000000
27 Feb 2013 17:51:36,004 INFO  [main] 
(org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:47)  - 
Starting up with /tmp/flume-check/checkpoint and 
/tmp/flume-check/checkpoint.meta
27 Feb 2013 17:51:36,078 INFO  [main] 
(org.apache.flume.channel.file.CheckpointRebuilder.rebuild:64)  - Attempting to 
fast replay the log files.
27 Feb 2013 17:51:36,112 INFO  [main] 
(org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:113)  - 
Unable to get maxDirectMemory from VM: NoSuchMethodException: 
sun.misc.VM.maxDirectMemory(null)
27 Feb 2013 17:51:36,117 INFO  [main] 
(org.apache.flume.tools.DirectMemoryUtils.allocate:47)  - Direct Memory 
Allocation:  Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 
526843904, Remaining = 526843904
27 Feb 2013 17:51:36,866 INFO  [main] 
(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
Encountered EOF at 150457 in /tmp/flume-data/log-3
27 Feb 2013 17:51:36,884 INFO  [main] 
(org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
Encountered EOF at 4095 in /tmp/flume-data/log-4
27 Feb 2013 17:51:36,887 INFO  [main] 
(org.apache.flume.channel.file.CheckpointRebuilder.rebuild:151)  - Replayed 0 
events using fast replay logic.
27 Feb 2013 17:51:36,889 INFO  [main] 
(org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:108)  
- Start checkpoint for /tmp/flume-check/checkpoint, elements to sync = 0
27 Feb 2013 17:51:36,896 INFO  [main] 
(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:120)  - 
Updating checkpoint metadata: logWriteOrderID: 1361955096886, queueSize: 0, 
queueHead: 0
27 Feb 2013 17:51:36,906 INFO  [main] 
(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
Updating log-3.meta currentPosition = 0, logWriteOrderID = 1361955096886
27 Feb 2013 17:51:36,908 INFO  [main] 
(org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  - 
Updating log-4.meta currentPosition = 4095, logWriteOrderID = 1361955096886



Some diagnostics:

# lsof +d /tmp/flume-data
COMMAND   PID            USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
bash    15144 juhani_connolly  cwd    DIR  252,0     4096 132605 /tmp/flume-data
sudo    16392            root  cwd    DIR  252,0     4096 132605 /tmp/flume-data
lsof    16394            root  cwd    DIR  252,0     4096 132605 /tmp/flume-data
lsof    16395            root  cwd    DIR  252,0     4096 132605 /tmp/flume-data


Attaching thread dump
                
> CheckpointRebuilder main method does not work
> ---------------------------------------------
>
>                 Key: FLUME-1929
>                 URL: https://issues.apache.org/jira/browse/FLUME-1929
>             Project: Flume
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>            Assignee: Hari Shreedharan
>            Priority: Minor
>         Attachments: FLUME-1929.patch
>
>
> Based on the discussion in this thread: 
> http://apache.markmail.org/thread/567cshrmz35okrq3 - the main method in 
> CheckpointRebuilder was not updated for the new data file format.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
