Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
> > > But now another issue is coming up: If I increase the number of records
> > > to 40,000, population of records slows down after about 10,000 records,
> > > speeds up again, and comes to an apparent stand-still at 32,900 records.
> > > After waiting some time the record counter (which is incremented in
> > > steps of 100 in my demo) goes up to 33,000. Then I gave up.
> > Try calling MergeChangeLog regularly, for example every 1000 rows. If
> > that does not help, attach your test program so we can reproduce ...
> Yes, this is the solution. Thank you. MergeChangeLog definitely should be
> documented in a better way.
You can file a bug report about it, or add it to the wiki yourself:
- There is http://wiki.freepascal.org/TBufDataset but it seems that there
is only one line of text ;-)
- In the FCL documentation
http://www.freepascal.org/docs-html/current/fcl/db/index.html I cannot
find TBufDataset at all.
L.
--
___
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
http://lists.lazarus-ide.org/listinfo/lazarus
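[Editorial note: the periodic-merge suggestion above can be sketched as
follows. This is only a sketch against the demo program further down in
this thread (FExportDataset, NUM_RECORDS and the loop variable i are
defined there); the 1000-row interval is just the example value suggested
above, not a tuned number.]

```pascal
// Sketch: merge the change log every 1000 rows during bulk insertion, so
// the pending-changes log never grows large enough to stall the dataset.
for i := 1 to NUM_RECORDS do
begin
  FExportDataset.Insert;
  // ... assign field values as in the demo ...
  FExportDataset.Post;
  if (i mod 1000) = 0 then
    FExportDataset.MergeChangeLog;  // fold pending edits into the buffer
end;
```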
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Am 27.03.2017 um 10:59 schrieb LacaK via Lazarus:
> > But now another issue is coming up: If I increase the number of records
> > to 40,000, population of records slows down after about 10,000 records,
> > speeds up again, and comes to an apparent stand-still at 32,900 records.
> > After waiting some time the record counter (which is incremented in
> > steps of 100 in my demo) goes up to 33,000. Then I gave up.
> Try calling MergeChangeLog regularly, for example every 1000 rows. If
> that does not help, attach your test program so we can reproduce ...
Yes, this is the solution. Thank you. MergeChangeLog definitely should be
documented in a better way.
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
> > Try calling FExportDataset.MergeChangeLog before: WriteLn('Saving...');
> > Did anything in your timing change?
> Ah - that's it. TBufDataset saves the records instantly now. Probably,
> this should go into the official wiki site for TBufDataset.
> But now another issue is coming up: If I increase the number of records
> to 40,000, population of records slows down after about 10,000 records,
> speeds up again, and comes to an apparent stand-still at 32,900 records.
> After waiting some time the record counter (which is incremented in
> steps of 100 in my demo) goes up to 33,000. Then I gave up.
Try calling MergeChangeLog regularly, for example every 1000 rows. If that
does not help, attach your test program so we can reproduce ...
L.
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
On Sunday 26 March 2017 23:53:08 Werner Pamler via Lazarus wrote:
> Trying to extend the import/export example of fpspreadsheet from a dBase
> table to a TBufDataset I came across this issue with TBufDataset: While
> data are posted to the database as quickly as usual, writing to file
> takes extremely long if there are more than a few thousand records.
>
> Run the demo attached below. On my system, I measure these (non-linearly
> scaling) execution times for writing the TBufDataset table to file:
>
> 1000 records -- 0.9 seconds
> 2000 records -- 8.8 seconds
> 3000 records -- 31.1 seconds
> etc.
>
> Compared to that, writing of the same data to a dbf file is a wink of an
> eye. Is there anything which I am doing wrong? Or should I report a bug?

Can you switch off the 'applyupdate' functionality in TBufDataset? MSEgui
TLocalDataset (a fork of FPC TBufDataset) writes 1'000'000 records in
about 0.4 seconds if the option bdo_noapply is set.

"
100: 0.313s
100: 0.308s
100: 0.319s
100: 0.311s
100: 0.411s
100: 0.293s
100: 0.327s
100: 0.321s
3000: 0.001s
3000: 0.001s
3000: 0.001s
"

"
procedure tmainfo.recev(const sender: TObject);
var
 i1: int32;
 t1: tdatetime;
begin
 locds.active:= false;
 locds.disablecontrols();
 try
  locds.active:= true;
  for i1:= 1 to reccount.value do begin
   locds.appendrecord([i1, inttostrmse(i1)+'abcdefghiklmnop', 10*i1]);
  end;
  t1:= nowutc();
  locds.savetofile('test.db');
  t1:= nowutc()-t1;
  writeln(reccount.value, ': ', formatfloatmse(t1*60*60*24, '0.000s'));
  locds.active:= false;
 finally
  locds.enablecontrols();
 end;
end;
"

Martin
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Am 27.03.2017 um 10:13 schrieb LacaK via Lazarus:
> Try calling FExportDataset.MergeChangeLog before: WriteLn('Saving...');
> Did anything in your timing change?
Ah - that's it. TBufDataset saves the records instantly now. Probably,
this should go into the official wiki site for TBufDataset.

But now another issue is coming up: If I increase the number of records to
40,000, population of records slows down after about 10,000 records,
speeds up again, and comes to an apparent stand-still at 32,900 records.
After waiting some time the record counter (which is incremented in steps
of 100 in my demo) goes up to 33,000. Then I gave up.

Again, if I run the demo with TMemDataset, these effects do not show up.
(As for the current code, see my other post of today.)
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Try calling FExportDataset.MergeChangeLog before: WriteLn('Saving...');
Did anything in your timing change?
-Laco.
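[Editorial note: in context, the suggested call sits directly before the
timing and saving block of the demo program attached to the original post.
This is only a sketch; FExportDataset, stream and fn are the variables
from that demo.]

```pascal
// Sketch: fold the accumulated change log into the record buffer once,
// immediately before saving, so SaveToStream does not have to walk the
// full per-record change log.
FExportDataset.MergeChangeLog;
WriteLn('Saving...');
FExportDataset.SaveToStream(stream);  // as in the demo; SaveToFile(fn) works too
```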
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Am 27.03.2017 um 00:53 schrieb Howard Page-Clark via Lazarus:
> I can get small performance increases by
> - avoiding FieldByName() calls and using AppendRecord
No, at least not for the issue I am referring to. Like in the answer to
Marc's comment: This happens while the table is populated, but the delay
occurs when the populated table is written to stream/file.
> - using SaveToFile and avoiding an intermediate memory stream
Has no noticeable effect. 1000 records really is not much, and flushing
the memory stream to disk occurs without any delay (see my code, which has
measurement points before and after writing the memory stream and after
writing to file). In fact, when I noticed this effect I did not have any
explicit writing code at all; I noticed an excessive delay while the
dataset was closed -- this is when the BufDataset is saved automatically.
> - increasing the value of PacketRecords
Not knowing what this is, I increased the value in multiples of 10 from 1
to 1E9 and don't see any effect within the usual scatter.
> Clearly either the insertion algorithm should be improved, or the
> buffering, or the way the buffered records are written to disk. Maybe
> all three areas of TBufDataset can be optimised for better performance.
Thanks. When I have time I'll write a bug report. The current TBufDataset
is usable only as a pure in-memory table which is never written to file.

BTW, in the attached modified demo code the TBufDataset can be replaced by
a TMemDataset (define "USE_MEM_DATASET"), and this one is written
instantly.
--- snip ---
program project1;

{$mode objfpc}{$H+}
{$DEFINE USE_MEM_DATASET}

uses
  SysUtils, Classes, db, memds, bufdataset;

const
  TABLENAME = 'people'; // name for the database table, extension will be added
  DATADIR = 'data';     // subdirectory where database is stored

const
  NUM_RECORDS = 5000;
  SECONDS_PER_DAY = 24 * 60 * 60;

var
  FExportDataset: TDataset;

procedure CreateDatabase;
var
  i: Integer;
  fn: String;
  stream: TMemoryStream;
  t: TDateTime;
begin
  ForceDirectories(DATADIR);
  fn := DATADIR + DirectorySeparator + TABLENAME + '.db';
  DeleteFile(fn);
  {$IFDEF USE_MEM_DATASET}
  FExportDataset := TMemDataset.Create(nil);
  {$ELSE}
  FExportDataset := TBufDataset.Create(nil);
  {$ENDIF}
  FExportDataset.FieldDefs.Add('Last name', ftString, 15);
  FExportDataset.FieldDefs.Add('First name', ftString, 10);
  FExportDataset.FieldDefs.Add('City', ftString, 15);
  FExportDataset.FieldDefs.Add('Birthday', ftDate);
  FExportDataset.FieldDefs.Add('Salary', ftCurrency);
  FExportDataset.FieldDefs.Add('Work begin', ftDateTime);
  FExportDataset.FieldDefs.Add('Work end', ftDateTime);
  FExportDataset.FieldDefs.Add('Size', ftFloat);
  {$IFNDEF USE_MEM_DATASET}
  TBufDataset(FExportDataset).CreateDataset;
  {$ENDIF}
  FExportDataset.Open;

  // Random data
  for i := 1 to NUM_RECORDS do
  begin
    if (i mod 100 = 0) then
      WriteLn(Format('Adding record %d...', [i]));
    FExportDataset.Insert;
    FExportDataset.FieldByName('Last name').AsString := 'A';
    FExportDataset.FieldByName('First name').AsString := 'B';
    FExportDataset.FieldByName('City').AsString := 'C';
    FExportDataset.FieldByName('Birthday').AsDateTime := 0;
    FExportDataset.FieldByName('Salary').AsFloat := 0;
    FExportDataset.FieldByName('Size').AsFloat := 0;
    FExportDataSet.FieldByName('Work begin').AsDateTime := 0;
    FExportDataSet.FieldByName('Work end').AsDateTime := 0;
    FExportDataset.Post;
  end;

  WriteLn('Saving...');
  t := Now;
  stream := TMemoryStream.Create;
  try
    {$IFDEF USE_MEM_DATASET}
    TMemDataset(FExportDataset).SaveToStream(stream);
    {$ELSE}
    TBufDataset(FExportDataset).SaveToStream(stream);
    {$ENDIF}
    stream.Position := 0;
    WriteLn('Written to memory stream: ',
      FormatFloat('0.000 s', (Now - t) * SECONDS_PER_DAY));
    stream.SaveToFile(fn);
  finally
    stream.Free;
  end;
  WriteLn('Done. Total time needed for saving: ',
    FormatFloat('0.000 s', (Now - t) * SECONDS_PER_DAY));
  FExportDataset.Close;
  WriteLn(Format('Created file "%s" in folder "%s".',
    [ExtractFileName(fn), ExtractFileDir(fn)]));
  FExportDataset.Free;
end;

begin
  CreateDatabase;
  WriteLn;
  WriteLn('Press ENTER to close.');
  ReadLn;
end.
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Am 27.03.2017 um 09:07 schrieb Marc Santhoff via Lazarus:
> I didn't count, but you make extensive use of the Random() function.
> Could that be the cause of slowness?
No, Random() is called only while the records are populated - this step is
completed without any noticeable delay. Time is measured afterwards, when
the populated table is written to stream/file.
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
On So, 2017-03-26 at 23:53 +0200, Werner Pamler via Lazarus wrote:
> Trying to extend the import/export example of fpspreadsheet from a dBase
> table to a TBufDataset I came across this issue with TBufDataset: While
> data are posted to the database as quickly as usual writing to file
> takes extremely long if there are more than a few thousand records.
>
> Run the demo attached below. On my system, I measure these (non-linearly
> scaling) execution times for writing the TBufDataset table to file:
>
> 1000 records -- 0.9 seconds
> 2000 records -- 8.8 seconds
> 3000 records -- 31.1 seconds
> etc.
>
> Compared to that, writing of the same data to a dbf file is a wink of an
> eye. Is there anything which I am doing wrong? Or should I report a bug?

I didn't count, but you make extensive use of the Random() function.
Could that be the cause of slowness?

HTH,
Marc

[...]
>   FExportDataset.Open;
>
>   // Random data
>   for i := 1 to NUM_RECORDS do begin
>     if (i mod 100 = 0) then
>       WriteLn(Format('Adding record %d...', [i]));
>     FExportDataset.Insert;
>     FExportDataset.FieldByName('Last name').AsString :=
>       LAST_NAMES[Random(NUM_LAST_NAMES)];
>     FExportDataset.FieldByName('First name').AsString :=
>       FIRST_NAMES[Random(NUM_FIRST_NAMES)];
>     FExportDataset.FieldByName('City').AsString :=
>       CITIES[Random(NUM_CITIES)];
>     FExportDataset.FieldByName('Birthday').AsDateTime :=
>       startDate - random(maxAge);
>     FExportDataset.FieldByName('Salary').AsFloat := 1000 + Random(9000);
>     FExportDataset.FieldByName('Size').AsFloat := (160 + Random(50)) / 100;
>     FExportDataSet.FieldByName('Work begin').AsDateTime :=
>       4 + EncodeTime(6 + Random(4), Random(60), Random(60), 0);
>     FExportDataSet.FieldByName('Work end').AsDateTime :=
>       EncodeTime(15 + Random(4), Random(60), Random(60), 0);
>     FExportDataset.Post;
>   end;
Re: [Lazarus] Writing >1000 TBufDataset records to file is extremely slow
On 26/03/17 22:53, Werner Pamler via Lazarus wrote:
> Trying to extend the import/export example of fpspreadsheet from a dBase
> table to a TBufDataset I came across this issue with TBufDataset: While
> data are posted to the database as quickly as usual writing to file
> takes extremely long if there are more than a few thousand records.
>
> Run the demo attached below. On my system, I measure these (non-linearly
> scaling) execution times for writing the TBufDataset table to file:
>
> 1000 records -- 0.9 seconds
> 2000 records -- 8.8 seconds
> 3000 records -- 31.1 seconds
> etc.
>
> Compared to that, writing of the same data to a dbf file is a wink of an
> eye. Is there anything which I am doing wrong? Or should I report a bug?

I don't think you do anything wrong. I can get small performance increases
by
- avoiding FieldByName() calls and using AppendRecord
- using SaveToFile and avoiding an intermediate memory stream
- increasing the value of PacketRecords
but the speedups are insignificant. Clearly either the insertion algorithm
should be improved, or the buffering, or the way the buffered records are
written to disk. Maybe all three areas of TBufDataset can be optimised for
better performance.
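[Editorial note: the first two points could look roughly like this. A
sketch only, not Howard's actual test code: AppendRecord takes the values
positionally in FieldDefs order, and FExportDataset and fn are the
identifiers from the demo attached to the original post.]

```pascal
// Sketch: append a whole record instead of eight per-field FieldByName
// lookups, then save directly to file without an intermediate memory stream.
FExportDataset.AppendRecord(['Chaplin', 'Charley', 'New York',
  Date(), 1234.56, Now, Now, 1.75]);  // values in FieldDefs order
FExportDataset.SaveToFile(fn);
```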
[Lazarus] Writing >1000 TBufDataset records to file is extremely slow
Trying to extend the import/export example of fpspreadsheet from a dBase
table to a TBufDataset I came across this issue with TBufDataset: While
data are posted to the database as quickly as usual, writing to file takes
extremely long if there are more than a few thousand records.

Run the demo attached below. On my system, I measure these (non-linearly
scaling) execution times for writing the TBufDataset table to file:

1000 records -- 0.9 seconds
2000 records -- 8.8 seconds
3000 records -- 31.1 seconds
etc.

Compared to that, writing of the same data to a dbf file is a wink of an
eye. Is there anything which I am doing wrong? Or should I report a bug?

--- snip ---
program project1;

{$mode objfpc}{$H+}

uses
  SysUtils, Classes, db, bufdataset;

const
  // Parameters for generating database file contents
  NUM_LAST_NAMES = 8;
  NUM_FIRST_NAMES = 8;
  NUM_CITIES = 10;
  LAST_NAMES: array[0..NUM_LAST_NAMES-1] of string = (
    'Chaplin', 'Washington', 'Dylan', 'Springsteen',
    'Brando', 'Monroe', 'Dean', 'Lincoln');
  FIRST_NAMES: array[0..NUM_FIRST_NAMES-1] of string = (
    'Charley', 'George', 'Bob', 'Bruce',
    'Marlon', 'Marylin', 'James', 'Abraham');
  CITIES: array[0..NUM_CITIES-1] of string = (
    'New York', 'Los Angeles', 'San Francisco', 'Chicago', 'Miami',
    'New Orleans', 'Washington', 'Boston', 'Seattle', 'Las Vegas');

  TABLENAME = 'people'; // name for the database table, extension will be added
  DATADIR = 'data';     // subdirectory where database file is stored
  DB_EXT: array[0..1] of string = ('.dbf', '.db');

const
  NUM_RECORDS = 1000;
  SECONDS_PER_DAY = 24 * 60 * 60;

var
  FExportDataset: TBufDataset;

procedure CreateDatabase;
var
  i: Integer;
  startDate: TDate;
  maxAge: Integer = 100 * 365;
  fn: String;
  stream: TMemoryStream;
  t: TDateTime;
begin
  ForceDirectories(DATADIR);
  startDate := Date();
  fn := DATADIR + DirectorySeparator + TABLENAME + DB_EXT[1];
  DeleteFile(fn);
  FExportDataset := TBufDataset.Create(nil);
//  FExportDataset.Filename := fn;
  FExportDataset.FieldDefs.Add('Last name', ftString, 15);
  FExportDataset.FieldDefs.Add('First name', ftString, 10);
  FExportDataset.FieldDefs.Add('City', ftString, 15);
  FExportDataset.FieldDefs.Add('Birthday', ftDate);
  FExportDataset.FieldDefs.Add('Salary', ftCurrency);
  FExportDataset.FieldDefs.Add('Work begin', ftDateTime);
  FExportDataset.FieldDefs.Add('Work end', ftDateTime);
  FExportDataset.FieldDefs.Add('Size', ftFloat);
  FExportDataset.CreateDataset;
  FExportDataset.Open;

  // Random data
  for i := 1 to NUM_RECORDS do
  begin
    if (i mod 100 = 0) then
      WriteLn(Format('Adding record %d...', [i]));
    FExportDataset.Insert;
    FExportDataset.FieldByName('Last name').AsString :=
      LAST_NAMES[Random(NUM_LAST_NAMES)];
    FExportDataset.FieldByName('First name').AsString :=
      FIRST_NAMES[Random(NUM_FIRST_NAMES)];
    FExportDataset.FieldByName('City').AsString :=
      CITIES[Random(NUM_CITIES)];
    FExportDataset.FieldByName('Birthday').AsDateTime :=
      startDate - Random(maxAge);
    FExportDataset.FieldByName('Salary').AsFloat := 1000 + Random(9000);
    FExportDataset.FieldByName('Size').AsFloat := (160 + Random(50)) / 100;
    FExportDataSet.FieldByName('Work begin').AsDateTime :=
      4 + EncodeTime(6 + Random(4), Random(60), Random(60), 0);
    FExportDataSet.FieldByName('Work end').AsDateTime :=
      EncodeTime(15 + Random(4), Random(60), Random(60), 0);
    FExportDataset.Post;
  end;

  WriteLn('Saving...');
  t := Now;
  stream := TMemoryStream.Create;
  try
    FExportDataset.SaveToStream(stream);
    stream.Position := 0;
    WriteLn('Written to memory stream: ',
      FormatFloat('0.000 s', (Now - t) * SECONDS_PER_DAY));
    stream.SaveToFile(fn);
  finally
    stream.Free;
  end;
  WriteLn('Done. Total time needed for saving: ',
    FormatFloat('0.000 s', (Now - t) * SECONDS_PER_DAY));
  FExportDataset.Close;
  WriteLn(Format('Created file "%s" in folder "%s".',
    [ExtractFileName(fn), ExtractFileDir(fn)]));
  FExportDataset.Free;
end;

begin
  CreateDatabase;
  WriteLn;
  WriteLn('Press ENTER to close.');
  ReadLn;
end.