Re: [Lazarus] A sort program in rpsrt102.zip (1).

2008-08-26 Thread JoshyFun
Hello Mehmet,

Tuesday, August 26, 2008, 7:35:00 AM, you wrote:

MES (A)
MES An algorithm ( http://en.wikipedia.org/wiki/Algorithm )
MES and its implementation are different concepts , at least
MES in the sense that , for an algorithm there may be many
MES distinct implementations of it .

Yes, you are right, but a good implementation of the same algorithm
does not provide a big bunch of gain usually. Different algorithms are
great for different sort tasks and QuickSort is, maybe, the best for
the general case.

MES (B)
MES RPSORT program itself is written in 16-bit 8086 assembler
MES but its bit size is not related to the file size it sorts .

Yes, but it matters in how it handle the data, as it can not deal with
more than 64 KB in each bunch, also you should not mix 16 bit and 32
bit assembler if possible. Anyway porting it to 32 bits (the algorithm
implementation, not the full code) should be quite easy.

MES (C)
MES Personally , I am not writing assembler programs , and
MES I do not know 32-bit assembly programming .
MES Therefore , I am not able to convert it to a 32-bit
MES assembly program .
MES I think a programmer fluent in assembly language programming
MES can do it . I am not able to evaluate its difficultyness but
MES it is my opinion that many statements may be converted by replace
MES operations .
MES After this it may be used as embedded into a Pascal procedure or
MES may be used as a spawned process to sort a file .

The problem is that assembler is not portable. Also a 32 bit pure
Pascal implementation should be even faster than RPSRT.

MES (D)
MES To make a proper reply to your question I prepared a simple program
MES compiled with FPC
MES to demonstrate its sorting speed as follows :

Great, as both have the same computer ;) except that I have 1 GB. So I
wrote a routine to sort it (the 10 000 000 lines file) with Lazarus
using the already provided primitives, nothing special programmed. I
had used your program to generate the test file.

After 4 runs the average time to sort the file:

Number of records File size   TemporalTime required for
as given for Mas MegaBytesdisk space  sorting HH:MMM:SS
---   --  -
 10 000 000   220.0 0.00  00:03:18
  5 000 000   110.0 0.00  00:01:23

--
procedure TForm1.Button1Click(Sender: TObject);
var
  L: TStringList;
  StartTime: TDateTime;
begin
  StartTime:=Now;
  L:=TStringList.Create;
  L.LoadFromFile('G:\RNumbers.txt');
  L.Sort;
  L.SaveToFile('G:\sorted.txt');
  L.Free;
  self.Caption:=TimeToStr(Now-StartTime);
end;
--

It does not beat RPSRT, so I decided to optimize it a bit, preparing
the TStringList to receive 20 million lines at least, which
preallocate 20 million pointer positions, and not freeing the memory
as the OS will do it in one step.

Number of records File size   TemporalTime required for
as given for Mas MegaBytesdisk space  sorting HH:MMM:SS
---   --  -
 10 000 000   220.0 0.00  00:02:53
  5 000 000   110.0 0.00  00:01:18

--
procedure TForm1.Button1Click(Sender: TObject);
var
  L: TStringList;
  StartTime: TDateTime;
begin
  StartTime:=Now;
  L:=TStringList.Create;
  L.Capacity:=2000;
  L.LoadFromFile('G:\RNumbers.txt');
  Beep;
  L.Sort;
  Beep;
  L.SaveToFile('G:\sorted.txt');
  Beep;
  self.Caption:=TimeToStr(Now-StartTime);
  L.Free; //Free time is not counted as RPSRT will not free 20 million pointers.
end;
--

Also disk write is not optimized in any way, so it could mean (not
checked in the code) 5/10 million writes to disk.

This code quite sure will explode with 2 Gigabytes file, and will take
ages to complete

MES Number of records File size   Time required for sorting
MES as given for Mas MegaBytesas Hour : Minute : Second .
MES ---   ---
MES only + Command Prompt working :
MES  5 000 000   110.0 00 : 01 : 11.24
MES 10 000 000   220.0 00 : 02 : 28.41
MES 25 000 000   550.0 00 : 06 : 29.47

MES100 000 000  2200.0 Locked :
MES   I think file size should be less than 2147.0
MES MegaBytes ,
MES   because of 32-bit MaxInt size ( 2 147 ... ... ) .
MES   It is locked after 2053 MegaBytes of its
MES temporary file
MES   left behind after its locking .
MES90 000 000 1980.0 01 : 32 : 45.98

Yes, 2 gigabytes will be a serious problem to RPSRT due file access
routines, 

[Lazarus] A sort program in rpsrt102.zip (1).

2008-08-25 Thread Mehmet Erol Sanliturk


Dear JoshyFun ,


(A)

An algorithm ( http://en.wikipedia.org/wiki/Algorithm )
and its implementation are different concepts , at least
in the sense that , for an algorithm there may be many
distinct implementations of it .


(B)

RPSORT program itself is written in 16-bit 8086 assembler
but its bit size is not related to the file size it sorts .


(C)
Personally , I am not writing assembler programs , and
I do not know 32-bit assembly programming .
Therefore , I am not able to convert it to a 32-bit
assembly program .
I think a programmer fluent in assembly language programming
can do it . I am not able to evaluate its difficultyness but
it is my opinion that many statements may be converted by replace
operations .

After this it may be used as embedded into a Pascal procedure or
may be used as a spawned process to sort a file .


(D)
To make a proper reply to your question I prepared a simple program
compiled with FPC
to demonstrate its sorting speed as follows :


 
Program Numbers ;
 
Var T : Text ;
 
Var M : LongInt ;
Var i : LongInt ;
 
Var p : Int64 ;
Var q : Int64 ;
 
Begin
 
M := 0 ;
 
Writeln ( 'Give number of values to generate :' ) ;
Readln ( M ) ;
 
If ( M  0 )
Then
Begin
 
Assign ( T , 'RNumbers.TXT' ) ;
ReWrite ( T ) ;
 
p := 9223372036854775807 ;
 
For i := 1 to M
Do
  Begin
 
  q := Random ( p )  ;
 
  Writeln ( T , q : 20 ) ;
 
  End ;
 
Close ( T ) ;
 
End
Else
Begin
Writeln ( 'Given value : ' , M , ' is NOT greater than zero .' ) ;
End ;
 
End .


 
Times are obtained in a PC
with Intel Pentium 4 CPU 3.00 GHz
   2 GB Memeory
and Windows XP Home Edition .


(E)

For each value of M as number of records ,
the following command has been applied :

   rpsort  rnumbers.txt  snumbers.txt

The RPSORT.COM is reporting the time used for sorting which
I checked by an actual clock and I have found them to be correct .

The results are as follows :


Number of records File size   Time required for sorting
as given for Mas MegaBytesas Hour : Minute : Second .
---   ---

  10 000  0.214 00 : 00 : 00.11 ( + Six 
program working )
100 000  2.2 00 : 00 : 01.43 ( + Six 
program working )
 1 000 00022.0 00 : 00 : 17.19 ( + Six 
program working )
 5 000 000  110.0 00 : 01 : 36.61 ( + Six 
program working )
   10 000 000  220.0 00 : 03 : 21.41 ( + Six program 
working )




only + Command Prompt working :

  5 000 000   110.0 00 : 01 : 11.24
10 000 000   220.0 00 : 02 : 28.41
25 000 000   550.0 00 : 06 : 29.47
  100 000 000 2200.0 Locked :

  I think file size should be less than 2147.0 
MegaBytes ,
  because of 32-bit MaxInt size ( 2 147 ... ... ) .
  It is locked after 2053 MegaBytes of its 
temporary file
  left behind after its locking .

   90 000 000 1980.0 01 : 32 : 45.98



When M = 10 000 000 :

 snumbers.txt which is already sorted
 is sorted as

   rpsort  snumbers.txt  tnumbers.txt


in time 00 : 01 : 50.84 .


 snumbers.txt which is already sorted in ascending order
 is sorted in descending order as


   rpsort /R  snumbers.txt  tnumbers.txt
 

in time 00 : 01 : 50.35 .


(F)
It is possible to generate multiple column files and
test it for multiple sort keys .
Details may be found in its documentation .



Thank you very much ,

Mehmet Erol Sanliturk

___
Lazarus mailing list
Lazarus@lazarus.freepascal.org
http://www.lazarus.freepascal.org/mailman/listinfo/lazarus


[Lazarus] A sort program in rpsrt102.zip .

2008-08-24 Thread Mehmet Erol Sanliturk

Dear Sirs ,


There is a program named as rpsort.com with its related sources
in rpsrt102.zip which may be found in the following address :

http://www.simtel.net/product.php?url_fb_product_page=51643

 or by an internet search with name   rpsrt102.zip  .


The following is a part from its readme :

--

This is being mailed to all those who either registered or inquired
  about RPSORT, written by my brother, Robert Pirko.


Robert Pirko, the author of RPSORT,
passed away on February 15, 1992.
He had not been ill , but just
suddenly collapsed and died.
You never know ...  


  RPSORT is now free for all to use, no payment is asked.  Also the
assembly language source code is free.

  The file RPSRT101.ZIP has been uploaded to a local (New York City) 
BBS:
 
NYACC 718 539-3064  (free access)

  This file contains version 1.01 of RPSORT and its documentation
plus the commented source code written in 8086 assembler for the
Borland Turbo Assembler 2.5.

  It is also available by mail for a payment of $5.00 to:

Alex Pirko
3881 Sedgwick Ave
Apt.6D
Bronx, NY 10463


--


rpsort.com is working perfectly under Windows XP ( with 8.3 format file 
names
( because its file name variable ( I think , due to IpNameExt DB 13 Dup 
(?) in its
  assembler source ) ) ) .


It is UNBELIEVABLY fast .

My suggestion is that it can be incorporated into a unit
( for example ,
as a procedure with a parameter
( containing giving definition of sort like its command-line parameter 
string ) ).

My opinion is that it may be useful to study it .


Thank you very much ,

Mehmet Erol Sanliturk
___
Lazarus mailing list
Lazarus@lazarus.freepascal.org
http://www.lazarus.freepascal.org/mailman/listinfo/lazarus


Re: [Lazarus] A sort program in rpsrt102.zip .

2008-08-24 Thread JoshyFun
Hello Mehmet,

Sunday, August 24, 2008, 10:07:12 PM, you wrote:


MES It is UNBELIEVABLY fast .
MES My suggestion is that it can be incorporated into a unit
MES ( for example ,
MES as a procedure with a parameter
MES ( containing giving definition of sort like its command-line parameter
MES string ) ).
MES My opinion is that it may be useful to study it .

Which is special in this software ? It is using the well known
QuickSort algorithm, as it is assembler it should run a faster
than a pure Pascal implementation, also it is 16 bits asm code.

It will be quite difficult to compare in speed with a FPC
implementation as the binary of RPSRT is .com file of 18KBs ;) and
will be loaded very fast (no rellocation needed), compared to a FPC
.exe binary.

Maybe you saw something that I was not aware about it :-?

-- 
Best regards,
 JoshyFun

___
Lazarus mailing list
Lazarus@lazarus.freepascal.org
http://www.lazarus.freepascal.org/mailman/listinfo/lazarus