I've been working on my first Smalltalk program, which needs to read and write large C structs from a binary file. I wrote two classes, BinaryStreamReader and BinaryStreamWriter, that take a stream and can read (or write) all of the integer and floating-point types I need; they also handle byte-swapping if necessary. I wrote a test program that focuses on just reading a small (for us) 123 MB data file on disk. The program takes about 166 seconds to run, compared to 1.2 seconds for an equivalent C version (about 140x faster than the Squeak version).
As an example of the style of code I've written, here is the method that reads an unsigned 32-bit integer:

    uint32
        "returns the next unsigned, 32-bit integer from the binary stream"
        "see PositionableStream for original implementation."

        | n a b c d |

        isBigEndian
            ifTrue: [
                a := stream next.
                b := stream next.
                c := stream next.
                d := stream next ]
            ifFalse: [
                d := stream next.
                c := stream next.
                b := stream next.
                a := stream next ].

        ((((a notNil and: [ b notNil ]) and: [ c notNil ])) and: [ d notNil ])
            ifTrue: [
                n := a.
                n := (n bitShift: 8) + b.
                n := (n bitShift: 8) + c.
                n := (n bitShift: 8) + d ]
            ifFalse: [ n := nil ].

        ^ n

There are 4 calls to stream next for each integer and, sure enough, a profile of the code (attached below) shows that most of the time is being lost in the StandardFileStream basicNext and next methods. There must be a better way to do this.

Scaled up to operational code, I will need to process about 40 GB of data per day. My C code currently takes about 16 CPU hours to do this work (including number crunching). In Squeak, just reading the data would take 3 CPU months!

Hopefully, someone can help me out here. The working code is available on SqueakSource if anyone is interested:

http://www.squeaksource.com/@CWlm_vX4hAPUzk5w/7SVjQQhp

Thanks,

David

Below is a message tally of my program:

 - 166088 tallies, 166100 msec.
**Tree**
100.0% {166100ms} SEAFileReader>>printAllBlocks
99.9% {165934ms} ProcessedPingBlock>>readFrom:
99.9% {165934ms} XYZAPingData>>readFrom:
99.7% {165602ms} XYZATransducerData>>readFrom:
95.9% {159290ms} XYZAPointData>>readFrom:
46.4% {77070ms} BinaryStreamReader>>double
|41.9% {69596ms} BinaryStreamReader>>uint32
| |28.1% {46674ms} StandardFileStream>>next
| | |14.1% {23420ms} primitives
| | |14.0% {23254ms} StandardFileStream>>basicNext
| |9.8% {16278ms} LargePositiveInteger>>+
| | |6.1% {10132ms} LargePositiveInteger(Integer)>>+
| | | |3.1% {5149ms} primitives
| | | |3.0% {4983ms} SmallInteger(Number)>>negative
| | |3.7% {6146ms} primitives
| |4.1% {6810ms} primitives
|2.5% {4153ms} Float class(Behavior)>>new:
|2.0% {3322ms} primitives
13.9% {23088ms} BinaryStreamReader>>float
|10.4% {17274ms} BinaryStreamReader>>uint32
| |7.0% {11627ms} StandardFileStream>>next
| | |3.5% {5814ms} primitives
| | |3.5% {5814ms} StandardFileStream>>basicNext
| |2.4% {3986ms} LargePositiveInteger>>+
|2.2% {3654ms} Float class>>fromIEEE32Bit:
13.7% {22756ms} BinaryStreamReader>>int32
|7.7% {12790ms} BinaryStreamReader>>uint32
| |6.8% {11295ms} StandardFileStream>>next
| | 3.5% {5814ms} StandardFileStream>>basicNext
| | 3.4% {5647ms} primitives
|5.2% {8637ms} SmallInteger>>>=
| 4.3% {7142ms} SmallInteger(Magnitude)>>>=
| 3.5% {5814ms} SmallInteger>><
| 2.6% {4319ms} SmallInteger(Integer)>><
10.7% {17773ms} BinaryStreamReader>>uint16
|6.9% {11461ms} StandardFileStream>>next
| |3.5% {5814ms} StandardFileStream>>basicNext
| |3.3% {5481ms} primitives
|3.8% {6312ms} primitives
6.8% {11295ms} BinaryStreamReader>>skip:
|5.0% {8305ms} StandardFileStream>>skip:
3.4% {5647ms} BinaryStreamReader>>int8
2.6% {4319ms} BinaryStreamReader>>uint8

**Leaves**
25.4% {42189ms} StandardFileStream>>basicNext
25.2% {41857ms} StandardFileStream>>next
6.0% {9966ms} BinaryStreamReader>>uint32
5.6% {9302ms} SmallInteger(Number)>>negative
4.6% {7641ms} LargePositiveInteger>>+
3.8% {6312ms} LargePositiveInteger(Integer)>>+
3.8% {6312ms} BinaryStreamReader>>uint16
3.4% {5647ms} Float class(Behavior)>>new:
2.0% {3322ms} BinaryStreamReader>>double

**Memory**
old +3,705,004 bytes
young -28,800 bytes
used +3,676,204 bytes
free +362,744 bytes

**GCs**
full 50 totalling 2,524ms (2.0% uptime), avg 50.0ms
incr 19959 totalling 2,794ms (2.0% uptime), avg 0.0ms
tenures 6,041 (avg 3 GCs/tenure)
root table 0 overflows

--
David Finlayson, Ph.D.
Operational Geologist
U.S. Geological Survey
Pacific Science Center
400 Natural Bridges Drive
Santa Cruz, CA 95060, USA
Tel: 831-427-4757, Fax: 831-427-4748, E-mail: [EMAIL PROTECTED]
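
Since the tally puts most of the time in the per-byte next and basicNext sends, one direction that might be worth trying is to fetch all four bytes with a single next: send and let ByteArray assemble the integer. The workspace snippet below is only a sketch, not a tested fix for the reader class. It assumes the image provides ByteArray>>unsignedLongAt:bigEndian: (present in the Squeak images I know of, but worth checking in yours), and it uses an in-memory ReadStream with made-up sample bytes as a stand-in for the binary file stream.

```smalltalk
"Workspace sketch: decode uint32 values from bytes fetched with one send,
 instead of four separate 'stream next' sends per integer.
 The sample bytes and variable names are hypothetical."
| bytes stream chunk big little |

"Eight sample bytes; a real program would use a file stream in binary mode."
bytes := #(16r12 16r34 16r56 16r78 16rDE 16rAD 16rBE 16rEF) asByteArray.
stream := ReadStream on: bytes.

"A single send answers a 4-element ByteArray."
chunk := stream next: 4.

"Assumed API: ByteArray>>unsignedLongAt:bigEndian: builds the 32-bit value
 from the four bytes starting at the given (1-based) index."
big := chunk unsignedLongAt: 1 bigEndian: true.      "16r12345678"
little := chunk unsignedLongAt: 1 bigEndian: false.  "16r78563412"

Transcript
    show: big printString; cr;
    show: little printString; cr.
```

Inside the reader, the same two sends could replace the body of uint32: fetch `stream next: 4`, answer nil if fewer than four bytes came back, otherwise answer `bytes unsignedLongAt: 1 bigEndian: isBigEndian`. This only reduces the number of stream sends per integer. Values that do not fit in a SmallInteger would still become LargePositiveIntegers, so the arithmetic cost that shows up in the tally may remain. Reading a larger block (for example one whole record) into a ByteArray and decoding fields from it by index would cut the stream sends further, at the cost of tracking offsets by hand.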