The problem is that (#(1 2 1 2 1 2) as: RunArray) does not save any space...
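A minimal workspace sketch of that point, assuming the Pharo/Squeak RunArray with its `runs` accessor (other dialects may name it differently):

```smalltalk
| alternating constant |
"Alternating values: no two neighbours are equal, so every run has length 1."
alternating := #(1 2 1 2 1 2) as: RunArray.
alternating runs size.    "6 runs for 6 elements, so nothing is saved"

"Values that stay constant over long stretches are the case RunArray targets."
constant := #(1 1 1 2 2 2) as: RunArray.
constant runs size.       "2 runs for 6 elements"
```

RunArray only merges adjacent equal elements, so a repeating pattern of two or more distinct lines is stored element by element.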

Nicolas

2010/2/10 Matthias Berth <[email protected]>:
> Hi Eliot,
>
> maybe RunArray helps here?
>
> From the class comment of RunArray (in Pharo):
> My instances provide space-efficient storage of data which tends to be
> constant over long runs of the possible indices. Essentially repeated
> values are stored singly and then associated with a "run" length that
> denotes the number of consecutive occurrences of the value.
>
> Cheers
>
> Matthias
>
> 2010/2/9 Eliot Miranda <[email protected]>:
>> Hi All,
>> I've just needed to make sense of a very long log file generated by
>> strace. The log file is full of entries like:
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> gettimeofday({1265744804, 491238}, NULL) = 0
>> sigreturn() = ? (mask now [])
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> and my workspace script reduces these to e.g.
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> gettimeofday({1265744797, 316183}, NULL) = 0
>> sigreturn() = ? (mask now [])
>> NEXT 2 LINES REPEAT 715 TIMES
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> gettimeofday({1265744797, 317189}, NULL) = 0
>> sigreturn() = ? (mask now [])
>>
>> My question is has anyone looked at this issue in any depth and perhaps come
>> up with something not as crude as the below and possibly even recursive.
>> i.e. the above would ideally be reduced to e.g.
>> NEXT 7 LINES REPEAT 123456 TIMES
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> gettimeofday({1265744797, 316183}, NULL) = 0
>> sigreturn() = ? (mask now [])
>> NEXT 2 LINES REPEAT BETWEEN 500 AND 800 TIMES
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> ioctl(8, 0x80045530, 0xbfd4fe70) = 0
>> ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> gettimeofday({1265744797, 317189}, NULL) = 0
>> sigreturn() = ? (mask now [])
>>
>>
>> Here's my quick hack that I ran in vw7.7nc:
>> | f o lines maxrun repeats range |
>> f := '../Cog/squeak.strace.log' asFilename readStream.
>> o := 'compressed.log' asFilename writeStream.
>> lines := OrderedCollection new.
>> maxrun := 50.
>> repeats := 0.
>> range := nil.
>> [[f atEnd] whileFalse:
>>     [lines size > maxrun ifTrue:
>>         [repeats > 0
>>             ifTrue:
>>                 [1 to: range first - 1 do:
>>                     [:i| o nextPutAll: (lines at: i); cr].
>>                  o nextPutAll: 'NEXT '; print: range size; nextPutAll: ' LINES REPEAT ';
>>                     print: repeats + 1; nextPutAll: ' TIMES'; cr.
>>                  range do:
>>                     [:i| o nextPutAll: (lines at: i); cr].
>>                  lines removeFirst: range last.
>>                  repeats := 0]
>>             ifFalse:
>>                 [o nextPutAll: lines removeFirst; cr; flush].
>>          range := nil].
>>      lines addLast: (f upTo: Character cr).
>>      [:exit|
>>      1 to: lines size do:
>>         [:i| | line repeat |
>>          line := lines at: i.
>>          repeat := lines nextIndexOf: line from: i + 1 to: lines size.
>>          (repeat ~~ nil
>>           and: [lines size >= (repeat - i * 2 + i)
>>           and: [(i to: repeat - 1) allSatisfy: [:j| (lines at: j) = (lines at: j - i + repeat)]]]) ifTrue:
>>             [repeats := repeats + 1.
>>              range isNil
>>                 ifTrue: [range := i to: repeat - 1]
>>                 ifFalse:
>>                     [range = (i to: repeat - 1) ifTrue:
>>                         [range do: [:ignore| lines removeAtIndex: repeat].
>>                          exit value]]]]] valueWithExit]]
>>     ensure: [f close. o close].
>> repeats
>> Forgive the cross post. I expect deep expertise in each newsgroup posted
>> to.
>> best
>> Eliot
>> _______________________________________________
>> Pharo-project mailing list
>> [email protected]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
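
For comparison, the "NEXT n LINES REPEAT m TIMES" idea can be sketched for the much simpler case where the block length is known in advance. This is not Eliot's script: the sample `lines` and the fixed `blockSize` are assumptions for illustration, and the sketch neither discovers the period on its own nor handles nested repeats.

```smalltalk
| lines blockSize out i |
"Hypothetical input: one leading line, a 2-line block repeated 3 times, one trailing line."
lines := #('a' 'x' 'y' 'x' 'y' 'x' 'y' 'b').
blockSize := 2.
out := OrderedCollection new.
i := 1.
[i <= lines size] whileTrue:
    [| count next |
     count := 1.
     next := i + blockSize.
     "Count how often the block starting at i is immediately followed by a copy of itself."
     [next + blockSize - 1 <= lines size
        and: [(lines copyFrom: i to: i + blockSize - 1)
                = (lines copyFrom: next to: next + blockSize - 1)]]
        whileTrue:
            [count := count + 1.
             next := next + blockSize].
     count > 1
        ifTrue:
            ["Emit a header plus a single copy of the block, then skip all its repeats."
             out add: 'NEXT ', blockSize printString, ' LINES REPEAT ', count printString, ' TIMES'.
             out addAll: (lines copyFrom: i to: i + blockSize - 1).
             i := next]
        ifFalse:
            ["No repeat starts here: copy the line through unchanged."
             out add: (lines at: i).
             i := i + 1]].
out asArray.
```

For the sample input this yields 'a', the header 'NEXT 2 LINES REPEAT 3 TIMES', 'x', 'y' and 'b'. Running it once per candidate block size, or re-running it on its own output, would be one way to approach the recursive form asked about above.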
