Roberto Cornacchia wrote: > I come back to this issue with some more interesting results. > > I did mention in the past that 'COPY INTO ... FROM stdin ..' seems to be > a lot slower than 'COPY INTO ... FROM <filename> ...' > > Actually, I have just found something that may shift the problem away > from the stdin itself. > > Look at the following two methods of restoring a previously dumped > (rather small) database. > > RESTORE METHOD 1 (far too slow): > > $ mclient < backup.sql > [ 62 ] > [ 70199 ] > [ 288 ] > [ 70322 ] > [ 69 ] > [ 423 ] > [ 280596 ] > [ 4302 ] > [ 49038 ] > [ 1 ] > [ 10 ] > [ 6943 ] > [ 6943 ] > [ 3424 ] > [ 3424 ] > [ 3424 ] > [ 1 ] > [ 6943 ] > [ 3424 ] > TIME elapsed: 68 seconds > > > RESTORE METHOD 2: > > $ echo "\< backup.sql" | mclient > [ 62 ] > [ 70199 ] > [ 288 ] > [ 70322 ] > [ 69 ] > [ 423 ] > [ 280596 ] > [ 4302 ] > [ 49038 ] > [ 1 ] > [ 10 ] > [ 6943 ] > [ 6943 ] > [ 3424 ] > [ 3424 ] > [ 3424 ] > [ 1 ] > [ 6943 ] > [ 3424 ] > TIME elapsed: 1 seconds > > Can anyone explain this difference?
When mclient reads from standard input, it reads a line at a time. When it reads from file (either using the \< construct or as a command line argument (i.e. without <)) it reads in large chunks. > One further consideration: Method 2 works fine, but cannot be used to > load gzipped dumps. Maybe this can be a feature request: '\<' should > accept gzipped input. Patches are gratefully accepted. ;-) > Roberto > > > On Mon, 2009-05-18 at 13:03 +0200, Stefan Manegold wrote: >> On Mon, May 18, 2009 at 11:57:57AM +0200, Stefan de Konink wrote: >>> On Mon, 18 May 2009, Stefan Manegold wrote: >>> >>>> In case you came to your conclusion from experimental/emperical study, I >>>> would be very interested to know about the experiemntes performed and the >>>> results measured. >>>> In that case, I would be verythankful if you could share the informative >>>> details with us, i.e. detailed descriptions of >>>> - experimental setup (HW, SW, datasets, workload/queries) >>>> - measured results (what was measured how, and what are the results?) >>>> - (your) discussion/analysis/interpretation of the results >>> I thought the MADAM meeting was for this. >> Ok. fine. >> >> ... we should then revive the ideas of keeping note of MADAMs (e.g., on a >> Wiki or alike) as reference for those who cannot be present ... >> >>>> Could you please elaborate on whether you consider you setup/schema >>>> "representaive" and/or whether/to which extend you think that your results >>>> for one specific case could be generalized to other cases? >>> Technically the only thing I have to prove to get my code reverted is that >>> there are cases where late contraints hurt performance. >> IMHO, our task should be the following: >> >> - analyze whether there is indeed a significat performance difference >> between early and late constraint checking; >> (5% do not justify any further time to be spent on this) >> - and if so, analyze under which circumstances which altertative performs >> (significatly) better or worse than the other; >> - and (only) then try to find out where the (significat) performance >> difference comes from >> - and (only) then >> + either try to eliminate the difference; >> + or >> * make dump/restore use the most suitable alternative (as far as >> possible with reasonable effort); >> * and document our experiences to give adivce for users >> >> Stefan >> >>> Stefan >>> -- Sjoerd Mullender
signature.asc
Description: OpenPGP digital signature
------------------------------------------------------------------------------
_______________________________________________ Monetdb-developers mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/monetdb-developers
