Can you send me the complete source file for this benchmark?

On Thu, Jan 21, 2010 at 11:50 AM, Jacob Rief <jacob.r...@gmail.com> wrote:
> Hello Kenton,
>
> 2010/1/20 Kenton Varda <ken...@google.com>:
> > (1) Normally micro-benchmarks involve running the operation in a loop
> > many times so that the total time is closer to 1s or more, not running
> > the operation once and trying to time that.  System clocks are not very
> > accurate at that scale, and depending on what kind of clock it is, it
> > may actually take significantly longer to read the clock than it does
> > to allocate memory.
> > (2) Your benchmark does not include the time spent actually reading the
> > file, which is what I asserted would be much slower than re-allocating
> > the buffer.  Sure, the seek itself is fast but it is pointless without
> > actually reading.
>
> I have now modified the benchmark; the code looks like this:
>
>   boost::posix_time::ptime
>       time0(boost::posix_time::microsec_clock::local_time());
>   boost::posix_time::ptime
>       time1(boost::posix_time::microsec_clock::local_time());
>   for (int i = 0; i < 1000000; ++i) {
>     const void* data;
>     int size;
>     fileInStream->Seek(offset, whence);
>     fileInStream->Next(&data, &size);
>   }
>   boost::posix_time::ptime
>       time2(boost::posix_time::microsec_clock::local_time());
>   for (int i = 0; i < 1000000; ++i) {
>     const void* data;
>     int size;
>     ::lseek64(fileDescriptor, offset, whence);
>     fileInStream.reset(
>         new google::protobuf::io::FileInputStream(fileDescriptor));
>     fileInStream->Next(&data, &size);
>   }
>   boost::posix_time::ptime
>       time3(boost::posix_time::microsec_clock::local_time());
>   std::cerr << "t1: "
>             << boost::posix_time::time_period(time1, time2).length()
>             << " t2: "
>             << boost::posix_time::time_period(time2, time3).length()
>             << std::endl;
>
> The difference is now less significant, but still measurable:
> t1: 00:00:02.068949 t2: 00:00:02.389942
> t1: 00:00:02.092842 t2: 00:00:02.429206
> t1: 00:00:02.080614 t2: 00:00:02.394708
> t1: 00:00:02.094289 t2: 00:00:02.429952
> t1: 00:00:02.323403 t2: 00:00:03.723459
> t1: 00:00:02.151486 t2: 00:00:03.711809
> t1: 00:00:02.084442 t2: 00:00:02.416326
> t1: 00:00:02.052930 t2: 00:00:02.383500
>
> > (3) What memory allocator are you using?  With tcmalloc, a malloc/free
> > pair should take around 50ns, two orders of magnitude less than your
> > 4us measurement.
>
> The 'new' operator is not overloaded. I use gcc-Version 4.4.1 20090725
> (Red Hat 4.4.1-2)
>
> Regards, Jacob
>
> > On Wed, Jan 20, 2010 at 2:17 PM, Jacob Rief <jacob.r...@gmail.com>
> > wrote:
> >>
> >> Hello Kenton,
> >> I have now run some benchmarks while Seek'ing through a
> >> FileInputStream. The testing code looks like this:
> >>
> >>   // initialize boost::posix_time
> >>   boost::posix_time::ptime
> >>       t0(boost::posix_time::microsec_clock::local_time());
> >>   boost::shared_ptr<google::protobuf::io::FileInputStream>
> >>       fileInStream(
> >>           new google::protobuf::io::FileInputStream(fileDescriptor));
> >>   boost::posix_time::ptime
> >>       t1(boost::posix_time::microsec_clock::local_time());
> >>   // using Seek(), the function available through my patch
> >>   fileInStream->Seek(offset, whence);
> >>   boost::posix_time::ptime
> >>       t2(boost::posix_time::microsec_clock::local_time());
> >>   // this is the default method of achieving the same
> >>   ::lseek64(fileDescriptor, offset, whence);
> >>   fileInStream.reset(
> >>       new google::protobuf::io::FileInputStream(fileDescriptor));
> >>   boost::posix_time::ptime
> >>       t3(boost::posix_time::microsec_clock::local_time());
> >>   std::cerr << "t1: "
> >>             << boost::posix_time::time_period(t1, t2).length()
> >>             << " t2: "
> >>             << boost::posix_time::time_period(t2, t3).length()
> >>             << std::endl;
> >>
> >> and on my Intel Core2 Duo CPU E8400 (3.00GHz) with 4GB of RAM,
> >> gcc-Version 4.4.1 20090725, compiled with -O2
> >> I get these numbers:
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000007
> >> t1: 00:00:00.000001 t2: 00:00:00.000002
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000002 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >>
> >> In absolute numbers, ~1 microsecond compared to 3-4 microseconds is
> >> not a big difference,
> >> but from a relative point of view, direct Seek'ing is much faster than
> >> object recreation. And since
> >> I have to seek a lot in the FileInputStream, the measured times will
> >> accumulate.
> >>
> >> Regards, Jacob
> >>
> >> 2010/1/19 Kenton Varda <ken...@google.com>:
> >> > Did you do any tests to determine if the performance difference is
> >> > relevant?
> >> >
> >> > On Mon, Jan 18, 2010 at 3:14 PM, Jacob Rief <jacob.r...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hello Kenton,
> >> >>
> >> >> 2010/1/18 Kenton Varda <ken...@google.com>:
> >> >> (...snip...)
> >> >> > As for code cleanliness, I find the Reset() method awkward since
> >> >> > the user has to remember to call it at the same time as they do
> >> >> > some other operation, like seeking the file descriptor.  Either
> >> >> > calling Reset() or seeking the file descriptor alone will put the
> >> >> > object in an inconsistent state.  It might make more sense to
> >> >> > offer an actual Seek() method which can safely perform both
> >> >> > operations together with an interface that is not so easy to
> >> >> > misuse.
> >> >>
> >> >> I agree, a 64bit Seek function would definitely be safer.
> >> >> Here is a patch for version 2.3.0 to add FileInputStream::Seek(). The
> >> >> patch also fixes a compiler warning.
> >> >> I also adapted my library to use this function and it seems to work.
> >> >> Regards, Jacob
> >> >
> >> >
> >
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To post to this group, send email to proto...@googlegroups.com.
> To unsubscribe from this group, send email to
> protobuf+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/protobuf?hl=en.
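
For reference, point (1) in the thread (time many iterations and divide, rather than timing a single call) can be sketched as below. This is only an illustration under stated assumptions: MeanNanosPerCall is a hypothetical helper, not part of protobuf or Boost, and it uses std::chrono::steady_clock in place of the boost::posix_time calls from the messages above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (an assumption for illustration, not protobuf or Boost
// API): calls `op` `iterations` times and returns the mean cost of one call
// in nanoseconds.
template <typename Op>
double MeanNanosPerCall(Op op, long iterations) {
  assert(iterations > 0);
  // steady_clock is monotonic, so adjustments to the system time during the
  // run cannot distort the measured interval.
  const auto start = std::chrono::steady_clock::now();
  for (long i = 0; i < iterations; ++i) {
    op();
  }
  const auto stop = std::chrono::steady_clock::now();
  // Read the clock only twice, outside the loop, so the cost of the clock
  // reads is amortized over all iterations.
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

To compare the two approaches discussed in the thread, one would pass a lambda wrapping the Seek() plus Next() sequence, then a second lambda wrapping lseek64(), reset() and Next(), each with an iteration count large enough that the total run takes on the order of a second.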