Re: grep a binary file

m ike Mon, 12 Sep 2005 23:51:17 -0700

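untried edit:
# dd bs=${jpg_size}c cbs=1c skip=${begin_offset}c count=1 if=CF_work of=$fn
# (caveat: dd measures skip= in blocks of bs, so with bs=${jpg_size} this
# skips begin_offset * jpg_size bytes rather than begin_offset bytes)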
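
Here is a minimal sketch of two byte-exact ways to cut a range out of a file, in case it helps. The names extract_dd and extract_fast are made up for illustration, and I am assuming a tail that accepts -c +N and a head that accepts -c (GNU and BSD both do).

```shell
# Hypothetical helpers, not part of the worksheet below.

# usage: extract_dd FILE OFFSET SIZE OUT   (OFFSET is 0-based, in bytes)
extract_dd() {
    # bs=1 makes skip= and count= plain byte counts; correct but slow
    dd if="$1" of="$4" bs=1 skip="$2" count="$3" 2>/dev/null
}

# usage: extract_fast FILE OFFSET SIZE OUT
extract_fast() {
    # tail -c +N starts at byte N counted from 1, hence OFFSET + 1
    tail -c "+$(( $2 + 1 ))" "$1" | head -c "$3" > "$4"
}
```

Something like extract_fast CF_work $begin_offset $jpg_size $fn would then stand in for the dd line, and it should not need one read per byte.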





On 9/12/05, m ike <[EMAIL PROTECTED]> wrote:
> 
> #############################################################
> # 2005-09-12
> # This is a worksheet that I developed to dissect intact 
> # all 116 jpgs from about 282MB of a reformatted and 
> # partially overwritten 512MB CF card (Olympus c5050)
> #
> # bed is a binary editor.
> #
> # The basic approach is to use bed's global regexp find/replace 
> # to replace the several bytes preceding each FF D8 (beginning 
> # marker of a jpg) with an end-of-line and a unique text string. 
> # save it. do the same for the FF D9 (closing jpg marker). 
> # grep -a -b the two modified files for the caret and unique 
> # string, ignoring every other match (which is the embedded thumbnail). 
> # Process original file using dd and adjusted byte offsets.
> # 
> # This page got me started (thanks TsuruZoh Tachibanaya):
> #
> # http://www.media.mit.edu/pia/Research/deepview/exif.html
> #
> # Here is bed (thanks Jaap Korthals Altes):
> #
> # http://bedlinux.tripod.com/
> #
> # A hexadecimal table
> #
> # http://www.prepressure.com/library/binhex.htm
> #
> # i'm at: [EMAIL PROTECTED] or z23r751gmail.com
> #
> ###############################################################
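
The marker hunt described above can also be sketched without a binary editor. This is a swapped-in technique, not what the worksheet does: it dumps the file as hex with od and lets awk report the 0-based offset of every two-byte marker. find_marker is a hypothetical name, the hex arguments must be lowercase, and on a 282MB image this would probably be slow.

```shell
# Hypothetical helper: print the 0-based byte offset of each HEX1 HEX2 pair.
# usage: find_marker FILE HEX1 HEX2      e.g. find_marker CF_work ff d8
find_marker() {
    od -An -v -tx1 "$1" |       # one hex byte per field, no address column
    tr -s ' ' '\n' |            # one hex byte per line
    awk -v a="$2" -v b="$3" '
        NF {
            # n is the offset of the current byte, so the pair began at n - 1
            if (prev == a && $1 == b) print n - 1
            prev = $1
            n++
        }'
}
```

Because the offsets come straight from the untouched file, they would need no +5 or -1 adjustment before being used as byte offsets.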
> 
> 
> ########## MAKE A WORKING COPY OF THE REFORMATTED PARTITION 
> cat /dev/sda1 > CF
> 
> # or use:
> # dd if=/dev/sda1 of=dd_of_CF
> 
> 
> ########## FIND BYTE OFFSET OF THE FIRST JPG RESIDING IN THE 
> ########## NOT-OVERWRITTEN HALF OF THE CARD
> # open file in bed
> # search in ascii mode for "EExif"
> # observe the exif date/time
> # locate first jpg in the non-overwritten section
> # FF D8 marks the jpg's beginning
> # move the cursor to that FF
> # note the byte offset as displayed in the lower right 
> # corner of bed window
> 
> 
> 
> ########## MAKE A SMALLER FILE TO WORK WITH
> # calculate the number of bytes in the file that 
> # follow the offset, add 6 bytes
> # filesize - byteoffset + 1 + 6
> 
> tail -c 287271942 CF > CF_work
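
The arithmetic behind that tail count can be sketched as below. keep_tail is a made-up name, and I am assuming the offset is counted from 0. The worksheet's formula has an extra + 1, which fits an offset counted from 1; if that is how bed displays it, add the 1 back.

```shell
# Hypothetical helper: keep everything from OFFSET to the end of FILE,
# plus PAD bytes in front of it (the worksheet uses 6).
# usage: keep_tail FILE OFFSET PAD OUT   (OFFSET is 0-based, in bytes)
keep_tail() {
    size=$(wc -c < "$1")                     # total size in bytes
    tail -c "$(( size - $2 + $3 ))" "$1" > "$4"
}
```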
> 
> 
> 
> 
> ########## MAKE FILES GREP-ABLE for FF D8 and FF D9
> 
> # verify that grep does not find caret-BEGIN in the file
> grep ^BEGIN CF_work
> 
> # open it in bed
> bed CF_work
> 
> # open bed's Replace dialog (under Alt-e)
> # select Reg Expr, digit Base 16
> # Find: .. .. .. .. .. .. FF D8
> # Replace: 0A 42 45 47 49 4E FF D8
> # press enter; choose all; takes a couple minutes to complete
> # use 'Save as' to save the edited file (to CF_begin_1)
> #
> # note: first invocation of a find/replace in bed is 
> # buggy. stop the invocation, then start it again.
> #
> # repeat for FF D9
> grep ^CLOSE CF_work
> # Find: FF D9 .. .. .. .. .. ..
> # Replace: FF D9 0A 43 4C 4F 53 45
> # save edited version (to CF_close_1)
> 
> 
> 
> ########## GRAB IMPORTANT LINES:
> 
> grep -a -b ^BEGIN CF_begin_1 | strings | grep BEGIN > CF_begin_2
> 
> grep -a -b ^CLOSE CF_close_1 | strings | grep CLOSE > CF_close_2
> 
> wc CF_begin_2
> # 232 232 3609 CF_begin_2
> 
> wc CF_close_2
> # 2548 2550 40829 CF_close_2
> 
> # note the excess occurrences of CLOSEs left over from previous 
> # uses of the card. these excess CLOSEs occur (?) at the end of 
> # the CF card, and in the not-overwritten space between the jpgs
> #
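
If only the byte offsets are wanted, a sketch like the following might replace the strings step. marker_offsets is a hypothetical name. grep -b prints the offset of the start of each matching line, which is why the later script adds 5 to a BEGIN offset (to step over the five letters) before treating it as the position of FF D8.

```shell
# Hypothetical helper: print the byte offset of every line starting with TAG.
# usage: marker_offsets TAG FILE        e.g. marker_offsets BEGIN CF_begin_1
marker_offsets() {
    # LC_ALL=C keeps grep and cut from tripping over non-text bytes
    LC_ALL=C grep -a -b "^$1" "$2" | LC_ALL=C cut -d: -f1
}
```

Since cut keeps only the number before the first colon, stray characters later in the line should no longer matter.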
> 
> 
> 
> ########## GRAB EVERY OTHER LINE, BEGINNING WITH 1ST OR 2ND LINE
> 
> cat CF_begin_2 | sed -n '1~2p' > CF_begin_2r
> 
> wc CF_begin_2r
> #116 116 1803 CF_begin_2r
> 
> # looks like 116 will be recovered !!
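
The 1~2 address is a GNU sed extension. A sketch of the same every-other-line pick with awk, for systems without GNU sed (every_other is a made-up name):

```shell
# Hypothetical helper: print every other line of stdin, starting at line FIRST.
# usage: every_other FIRST < infile     (FIRST is 1 or 2)
every_other() {
    awk -v first="$1" 'NR >= first && (NR - first) % 2 == 0'
}
```

For example, every_other 1 < CF_begin_2 > CF_begin_2r keeps lines 1, 3, 5 and so on.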
> 
> 
> 
> 
> ############ INSPECT THE BYTE-DISTANCE BETWEEN SUCCESSIVE BEGINS
> 
> old_offset=0
> n_begin=0
> for i in `cat CF_begin_2`; do
> (( n_begin += 1 ))
> new_offset=${i/\:BEGIN*/} 
> distance=$(( new_offset - old_offset ))
> # if [ $distance -lt 4096 ]; then
> printf "%4d " $n_begin 
> printf "%10d %10d " $new_offset $old_offset
> printf "%10d\n" $distance
> # fi;
> old_offset=$new_offset
> done
> ###
> ### everything looks good so far (every other distance is 4096)
> ###
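
The same distance check can be sketched with a read loop, which takes one line at a time instead of splitting the file on whitespace. begin_distances is a hypothetical name, and the input is assumed to hold lines of the form offset:BEGIN... as produced earlier.

```shell
# Hypothetical helper: list each BEGIN offset, the previous one, and the gap.
# usage: begin_distances FILE
begin_distances() {
    prev=0
    n=0
    # IFS=: puts the number before the first colon in $offset
    while IFS=: read -r offset rest; do
        n=$(( n + 1 ))
        printf '%4d %10d %10d %10d\n' "$n" "$offset" "$prev" "$(( offset - prev ))"
        prev=$offset
    done < "$1"
}
```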
> 
> 
> 
> ############ DELETE BY HAND ANY CHARACTERS (E.G. GREATERTHAN) 
> ############ THAT WILL CAUSE THE FOLLOWING SCRIPT TO FAIL
> xemacs CF_close_2 &
> 
> 
> 
> 
> ############ CALCULATE EACH JPG'S BEGINNING BYTE-OFFSET 
> ############ AND ITS FILE SIZE
> #
> # if this code runs smoothly, uncomment the CPU intensive 
> # dd command and run it again to dissect out the jpgs.
> #
> old_close_offset=0
> n_begin=0
> for i in `cat CF_begin_2r`; do 
> (( n_begin += 1 ))
> offset=${i/\:BEGIN*/}; 
> begin_offset=$(( offset + 5 ))
> found=0;
> n_close=0
> for j in `cat CF_close_2`; do 
> (( n_close += 1 ))
> offset=${j/\:CLOSE*/}; 
> close_offset=$(( offset - 1 )) 
> if [ $close_offset -gt $begin_offset ]; then 
> if [ $found -gt 0 ]; then 
> break;
> fi; 
> found=$(( found + 1 ));
> fi; 
> done;
> jpg_size=$(( $close_offset - $begin_offset ))
> gap_size=$(( $begin_offset - $old_close_offset ))
> fn=`printf "recovered_%04d.jpg" $n_begin`
> printf "%12s " $fn
> printf "%5d %5d " $n_begin $n_close
> printf "%10d %10d " $gap_size $jpg_size
> printf "%10d %10d\n" $begin_offset $close_offset 
> old_close_offset=$close_offset
> # dd bs=1c cbs=1c skip=${begin_offset}c count=${jpg_size}c if=CF_work of=$fn
> done
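
The pairing rule in the loop above (skip the first CLOSE after a BEGIN, since that one closes the embedded thumbnail, and take the second) can be sketched in a single awk pass. pair_offsets is a made-up name. Unlike the loop above, this sketch prints nothing for a BEGIN that has fewer than two CLOSEs after it.

```shell
# Hypothetical helper: print "name begin_offset size" for each full-size jpg.
# usage: pair_offsets BEGIN_FILE CLOSE_FILE
#        e.g. pair_offsets CF_begin_2r CF_close_2
pair_offsets() {
    awk -F: '
        # first file read (the CLOSE list): remember each adjusted offset
        NR == FNR { close_at[++nc] = $1 - 1; next }
        {
            start = $1 + 5          # step over the letters B E G I N
            seen = 0
            for (k = 1; k <= nc; k++)
                if (close_at[k] > start && ++seen == 2) {
                    printf "recovered_%04d.jpg %d %d\n", ++n, start, close_at[k] - start
                    break
                }
        }' "$2" "$1"
}
```

Each output line carries the offset and size that the commented-out dd command expects as begin_offset and jpg_size.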
> 
> 
> 
>

--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
