At 02:27 28/3/2002, you wrote:
>On 28/03/02 at 13:37 Phoebus Dokos wrote:
>
> >Hi all,
> >I was wondering if anyone has tried moving screen memory around using a
> >lookup table instead of calculating the memory position of each pixel
> >every time...
>
>The question is: what is so difficult about calculating a pixel position?
>And, more importantly, where did you get the idea that rectangular
>graphical objects are moved around (or indeed bitblt operations are
>performed) pixel by pixel?
In my case, where you use aligned blocks on the screen (i.e. the pixel
column stays the same while the row changes, so you are in essence moving
around a "virtual block" rather than a contiguous physical memory block),
the calculations can take a lot of time, especially as the numbers
increase. I believe the following operations have to be performed for what
I want to do:

1. Feed the pixel values to your routine (let's assume Mode 32 or 33).
2. Get SCR_BASE. (You HAVE to get it every single time, since from what
   Marcel tells me the screen base can change, especially on QPC - I have
   no info on the Qx0s, and the Aurora is not the case here.)
3. Calculate the start address, then the length, then recalculate the
   start for the next line, etc. (Not very slow when starting at the
   beginning, but it can get painfully slow, especially in S*Basic with an
   offset.)
4. Grab each line and do what you want with it.
5. Go back to 3 to recalculate for the next line (see no. 3... I listed it
   twice there :-)
6. Finally return.

With my method I believe things are simpler:

1. Feed the memory address value via the table to your memory move routine.
2. Make a simple subtraction to find the length.
3. Move.
4. Repeat.

That's one calculation instead of at least 4. (Don't forget you have to
make at least 3 multiplications, whereas with the second method there is
only one subtraction.)

>If at all possible, you ALWAYS move the maximum
>unit of data the CPU can move, which in this case is a long word, or even
>several long words on systems capable of loop mode and burst access. In
>case of non-aligned pixels, aligned ones and non-aligned ones are
>calculated separately.
>Also, pixels to the right are a constant bit number apart, which means
>calculating successive pixel addresses is trivial. Pixels down are also
>easily addressed by adding a line length offset.

See above why I think it should be faster. BTW, for Mode 33 or 32 you need
to move a constant amount of words.
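The two approaches being compared above can be sketched in C (all names and
parameters here are illustrative assumptions, not actual QDOS/SMSQ calls;
2 bytes per pixel is assumed, as for the high-colour modes discussed later):

```c
#include <stdint.h>

/* Assumed screen parameters - not taken from any real driver. */
#define LINE_LEN 2048u          /* bytes per screen line (assumed) */
#define BPP      2u             /* bytes per pixel (Mode 32/33 style) */

/* Approach 1: compute each row's start address from coordinates.
 * One multiplication (y * LINE_LEN) per row, plus the x offset. */
static uint8_t *row_addr_calc(uint8_t *scr_base, uint32_t x, uint32_t y)
{
    return scr_base + y * LINE_LEN + x * BPP;
}

/* Approach 2: precompute a row-start lookup table once; afterwards each
 * row address is a single indexed load instead of a multiplication. */
static void build_row_table(uint8_t *scr_base, uint8_t *table[], uint32_t rows)
{
    for (uint32_t y = 0; y < rows; y++)
        table[y] = scr_base + y * LINE_LEN;
}

static uint8_t *row_addr_table(uint8_t *table[], uint32_t x, uint32_t y)
{
    return table[y] + x * BPP;
}
```

Note that if the screen base can move (as described for QPC above), the
table has to be rebuilt - or hold offsets rather than absolute addresses -
whereas the calculated version only needs the fresh SCR_BASE value.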
> >That table would be an X (the x coordinate) by Y (the y coordinate) and
> >in each x by y position, the screen memory address that this corresponds
> >to would be contained (and maybe the alternate memory location for
> >swapping)...
>
>OK, for a 32-bit pixel address, and a max 1024x1024 screen, a table takes
>4MB. Two alternate addresses, 8MB. Assuming an address is an offset to a
>fixed screen address, 20 bits are needed to describe it, which is not a
>multiple of any convenient CPU data format (byte, word, long), so we are
>now into calculating positions in a table and masking off excess bits -
>sounds familiar? Just change positions in the table with positions on
>screen. The table is still 2.5MB. If we go the other way around, and map
>pixels of the graphical object rather than the screen, each graphical
>object would get 4 bytes of a table for each pixel - not counting the
>pixel itself - for a 32-bit address, 3 bytes for a 24-bit address with
>problems on access, and finally 2.5 bytes for a 20-bit address with even
>bigger problems on access. Considering that the deepest pixel is currently
>2 bytes, it's at the very least a 125% inflation in size of any object.

Yes, but under my specification only partial areas - those where a lot of
moves are needed - would be tabulated, so the resulting tables would be
significantly smaller. In any case, the alternate address can be derived
with one addition, which is a lot faster than multiplying three times,
adding, and then adding the result to get the new address. Inflation of
size CAN be a problem, but on non-FPU-equipped machines I still believe
the result would be a lot faster.

>All of these are completely trivial calculations. I am sure this concept
>could therefore have been evaluated for practicality (and discarded!)
>before proposing it to the general public.

Trivial, sure - but make all those calculations a hundred times over and
suddenly they are not trivial anymore :-) (especially where fast shifting
of data around is needed).
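The size figures quoted above can be checked with a little arithmetic. For
comparison, a table with one entry per screen *line* rather than per pixel
(one way of tabulating only what the aligned-block case actually needs -
my comparison, not something either poster proposed) is tiny:

```c
#include <stdint.h>

/* Per-pixel table for a max 1024x1024 screen, one entry per pixel,
 * entries packed at the given bit width. */
static uint32_t per_pixel_table_bytes(uint32_t bits_per_entry)
{
    return 1024u * 1024u * bits_per_entry / 8u;
}

/* Per-line table: one 32-bit row-start address per screen line. */
static uint32_t per_line_table_bytes(void)
{
    return 1024u * 4u;
}
/* per_pixel_table_bytes(32) is 4MB and per_pixel_table_bytes(20) is
 * 2.5MB, matching the figures in the quoted text; the per-line table
 * is only 4KB. */
```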
> >Surely this wastes memory
>
>This is about the same understatement as saying you are slightly hot
>standing in the middle of a bonfire

Depends where you come from ;-) (Hehe). Seriously, see above - I am
referring to partial areas, and in any case on QPC and the Qx0 memory is
not the issue :-)

> >but I have the feeling that this could SUBSTANTIALLY accelerate "bitblt"
>
>Not even the slightest - accessing anything outside the CPU is many times
>slower than calculating with values in registers. This is why you keep
>your accesses to memory as limited as possible, and when you make them,
>you move as much data as possible in a single access.

If you noticed, I had "bitblt" within quotes... ;-) Yes indeed, it is
faster to do fewer, bigger moves than many small ones (IIRC this is
limited to a certain size, though, after which the operations get slower).

>Nasta

Phoebus
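The "move as much data as possible per access" point can be sketched as a
per-row rectangle copy (a generic illustration with assumed names, not the
actual driver code; 2 bytes per pixel assumed): each row is moved with one
memcpy, which the C library implements with the widest loads and stores the
CPU supports, and successive rows need only one addition of the line
length, not a fresh multiplication.

```c
#include <stdint.h>
#include <string.h>

/* Copy a w-by-h pixel rectangle at (x, y) from src to dst, both assumed
 * to be framebuffers with line_len bytes per line and 2 bytes per pixel.
 * The y * line_len multiplication happens once per rectangle; after that
 * each row is one memcpy plus one add. */
static void blit_rect(uint8_t *dst, const uint8_t *src,
                      uint32_t line_len,
                      uint32_t x, uint32_t y,
                      uint32_t w, uint32_t h)
{
    const uint32_t bpp = 2;
    uint32_t offset = y * line_len + x * bpp;   /* start of first row */
    for (uint32_t row = 0; row < h; row++) {
        memcpy(dst + offset, src + offset, w * bpp);
        offset += line_len;                     /* next row: one add  */
    }
}
```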
