Hi Bob,

Thanks for your quick response; I really appreciate your reply! We are using HP Storage, so I guess our infrastructure is OK.
Let's discuss "cp A1 A1.bk". Correct me if I am wrong: in this cp, the OS needs to read all of A1.bk's data blocks from storage into the cache before overwriting them with A1's blocks, so some time would be spent on that. However, if A1.bk is new, cp would take free data blocks from the superblock's free list, which I guess should be faster. Apart from this, read/write cache hits can make some difference in performance. When you use dd, I guess most of your data would already be in the buffer cache, the read-hit rate would be higher, and very few calls would go to the backend storage. Does this make any sense?

Thanks,
Hemant

-----Original Message-----
From: Bob Proulx [mailto:[email protected]]
Sent: Wednesday, December 22, 2010 9:17 PM
To: Hemant Rumde
Cc: [email protected]; [email protected]
Subject: Re: cp command performance

Hemant Rumde wrote:
> I do not log any bug for cp command.

In that case I will close the bug report that you have opened. Let's have the discussion on the discussion mailing list [email protected]; that is the more appropriate place. I have set the mail headers to direct discussion there, but if your mailer doesn't comply please manually redirect it.

> In our company, we copy huge Cobol files before processing data. This
> is to roll back our data files. Suppose A1 is my huge file of 60GB and
> A1.bk is its backup file, made before we process (change) data in A1.
> Which of our methods would be faster?
>
> 1. Method-1 (A1.bk exists)
> $ cp A1 A1.bk
>
> 2. Method-2
> $ rm -f A1.bk
> $ cp A1 A1.bk
>
> 3. Method-3
> $ cp --remove-destination A1 A1.bk

All three of those should be virtually the same, especially the last two. But benchmarking is always good. I created a 10G test file using dd, copied it once to set up the test, and then performed the following operations on an ext3 filesystem.

$ time cp testdata testdata.bak
real    3m34.435s
user    0m0.108s
sys     0m30.950s

$ time ( rm -f testdata.bak ; cp testdata testdata.bak )
real    3m27.941s
user    0m0.092s
sys     0m30.914s

$ time cp --remove-destination testdata testdata.bak
real    3m36.931s
user    0m0.068s
sys     0m30.862s

As you can see, the times for all three operations are within the limits of being exactly the same.

> This operation is very simple. But our operators tell us that in some
> cases cp takes longer. How can we reduce the copying time?

I do not doubt that there will be differences in the time consumed by the raw command. With such a large file I think this will depend on outside influences, such as which filesystem you are using for the copy, how much RAM you have available for the buffer cache, whether extraneous sync and fsync calls are happening at the same time, and so forth. I could send for-examples, but I don't want to send you off in the wrong direction and so will resist.

Bob
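
For what it's worth, one way to separate the buffer-cache effects Hemant describes from raw copy speed is to flush the cache between runs. This is only a rough sketch, assuming a Linux host with root access (the drop_caches interface is Linux-specific, and the testdata file name simply follows Bob's example):

$ dd if=/dev/zero of=testdata bs=1M count=10240   # create a ~10G test file, as in Bob's setup
$ sync                                            # write any dirty pages out to disk first
$ echo 3 | sudo tee /proc/sys/vm/drop_caches      # drop the page cache, dentries and inodes
$ time cp testdata testdata.bak                   # cold-cache timing: reads must go to the backing storage

Repeating the same cp without dropping the cache can then show the warm-cache (high read-hit) case, to the extent the file still fits in RAM.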
