Hi Eliot, maybe RunArray helps here?
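
A minimal sketch of what I mean, assuming the log lines are already held in a collection and that RunArray in your image responds to newFrom:, runs and values (the sample lines here are made up for illustration). Note that a RunArray only merges adjacent equal elements, so an alternating pair such as the two ioctl calls in your log would not collapse unless each pair is first joined into a single element.

```smalltalk
| lines compressed |
"Hypothetical sample data; in practice these would be read from the strace log."
lines := #('sigreturn()' 'ioctl(8)' 'ioctl(8)' 'ioctl(8)' 'gettimeofday()').

"newFrom: adds the elements one at a time; adjacent equal elements share one run."
compressed := RunArray newFrom: lines.

"Show each stored value once, prefixed by the length of its run."
compressed runs with: compressed values do: [:count :line |
    Transcript show: count printString, ' x ', line; cr].
```

This only covers runs of a single repeated line; detecting repeated blocks of several lines, as your script does, would still need a separate grouping step.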

From the class comment of RunArray (in Pharo):

My instances provide space-efficient storage of data which tends to be constant over long runs of the possible indices. Essentially repeated values are stored singly and then associated with a "run" length that denotes the number of consecutive occurrences of the value.

Cheers
Matthias

2010/2/9 Eliot Miranda <[email protected]>:
> Hi All,
> I've just needed to make sense of a very long log file generated by
> strace. The log file is full of entries like:
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744804, 491238}, NULL) = 0
> sigreturn() = ? (mask now [])
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> and my workspace script reduces these to e.g.
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn() = ? (mask now [])
> NEXT 2 LINES REPEAT 715 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn() = ? (mask now [])
>
> My question is has anyone looked at this issue in any depth and perhaps come
> up with something not as crude as the below and possibly even recursive.
> i.e. the above would ideally be reduced to e.g.
> NEXT 7 LINES REPEAT 123456 TIMES
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn() = ? (mask now [])
> NEXT 2 LINES REPEAT BETWEEN 500 AND 800 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn() = ? (mask now [])
>
>
> Here's my quick hack that I ran in vw7.7nc:
> | f o lines maxrun repeats range |
> f := '../Cog/squeak.strace.log' asFilename readStream.
> o := 'compressed.log' asFilename writeStream.
> lines := OrderedCollection new.
> maxrun := 50.
> repeats := 0.
> range := nil.
> [[f atEnd] whileFalse:
>     [lines size > maxrun ifTrue:
>         [repeats > 0
>             ifTrue:
>                 [1 to: range first - 1 do:
>                     [:i| o nextPutAll: (lines at: i); cr].
>                  o nextPutAll: 'NEXT '; print: range size; nextPutAll: ' LINES REPEAT ';
>                      print: repeats + 1; nextPutAll: ' TIMES'; cr.
>                  range do:
>                     [:i| o nextPutAll: (lines at: i); cr].
>                  lines removeFirst: range last.
>                  repeats := 0]
>             ifFalse:
>                 [o nextPutAll: lines removeFirst; cr; flush].
>          range := nil].
>      lines addLast: (f upTo: Character cr).
>      [:exit|
>       1 to: lines size do:
>         [:i| | line repeat |
>          line := lines at: i.
>          repeat := lines nextIndexOf: line from: i + 1 to: lines size.
>          (repeat ~~ nil
>           and: [lines size >= (repeat - i * 2 + i)
>           and: [(i to: repeat - 1) allSatisfy: [:j| (lines at: j) = (lines at: j - i +
>               repeat)]]]) ifTrue:
>             [repeats := repeats + 1.
>              range isNil
>                 ifTrue: [range := i to: repeat - 1]
>                 ifFalse:
>                     [range = (i to: repeat - 1) ifTrue:
>                         [range do: [:ignore| lines removeAtIndex: repeat].
>                          exit value]]]]] valueWithExit]]
>     ensure: [f close. o close].
> repeats
>
> Forgive the cross post. I expect deep expertise in each newsgroup posted
> to.
> best
> Eliot
> _______________________________________________
> Pharo-project mailing list
> [email protected]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
