Can you send me the complete source file for this benchmark?
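
A minimal sketch of the loop-based timing described in point (1) of the quoted message below, assuming a C++11 compiler for std::chrono. The names MeanNanosPerCall and MallocFreePair are hypothetical; they are not part of protobuf or of the benchmark in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (an assumption, not protobuf API): runs `op`
// `iterations` times and returns the mean cost of one call in nanoseconds.
// The clock is read only twice, so its own overhead is amortized over the loop.
template <typename Op>
double MeanNanosPerCall(std::size_t iterations, Op op) {
  assert(iterations > 0);
  // steady_clock is monotonic, so wall-clock adjustments cannot skew the result.
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    op();
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}

// Example operation to time: one malloc/free pair. The volatile pointer
// discourages the compiler from removing the allocation entirely.
inline void MallocFreePair() {
  void* volatile p = std::malloc(64);
  std::free(p);
}
```

Calling MeanNanosPerCall(1000000, MallocFreePair) gives an averaged per-call figure that can be compared with the malloc/free estimate in point (3); the same helper could wrap a Seek()/Next() pair.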

On Thu, Jan 21, 2010 at 11:50 AM, Jacob Rief <jacob.r...@gmail.com> wrote:

> Hello Kenton,
>
> 2010/1/20 Kenton Varda <ken...@google.com>:
> > (1) Normally micro-benchmarks involve running the operation in a loop many
> > times so that the total time is closer to 1s or more, not running the
> > operation once and trying to time that.  System clocks are not very accurate
> > at that scale, and depending on what kind of clock it is, it may actually
> > take significantly longer to read the clock than it does to allocate memory.
> > (2) Your benchmark does not include the time spent actually reading the
> > file, which is what I asserted would be much slower than re-allocating the
> > buffer.  Sure, the seek itself is fast, but it is pointless without actually
> > reading.
>
> I have now modified the benchmark; the code looks like this:
>
> boost::posix_time::ptime time0(boost::posix_time::microsec_clock::local_time());
> boost::posix_time::ptime time1(boost::posix_time::microsec_clock::local_time());
> for (int i = 0; i < 1000000; ++i) {
>        const void* data;
>        int size;
>        fileInStream->Seek(offset, whence);
>        fileInStream->Next(&data, &size);
> }
> boost::posix_time::ptime time2(boost::posix_time::microsec_clock::local_time());
> for (int i = 0; i < 1000000; ++i) {
>        const void* data;
>        int size;
>        ::lseek64(fileDescriptor, offset, whence);
>        fileInStream.reset(new google::protobuf::io::FileInputStream(fileDescriptor));
>        fileInStream->Next(&data, &size);
> }
> boost::posix_time::ptime time3(boost::posix_time::microsec_clock::local_time());
> std::cerr << "t1: " << boost::posix_time::time_period(time1, time2).length()
>           << " t2: " << boost::posix_time::time_period(time2, time3).length()
>           << std::endl;
>
> The difference now is less significant, but still measurable:
> t1: 00:00:02.068949 t2: 00:00:02.389942
> t1: 00:00:02.092842 t2: 00:00:02.429206
> t1: 00:00:02.080614 t2: 00:00:02.394708
> t1: 00:00:02.094289 t2: 00:00:02.429952
> t1: 00:00:02.323403 t2: 00:00:03.723459
> t1: 00:00:02.151486 t2: 00:00:03.711809
> t1: 00:00:02.084442 t2: 00:00:02.416326
> t1: 00:00:02.052930 t2: 00:00:02.383500
>
> > (3) What memory allocator are you using?  With tcmalloc, a malloc/free pair
> > should take around 50ns, two orders of magnitude less than your 4us
> > measurement.
>
> The 'new' operator is not overloaded. I use gcc-Version 4.4.1 20090725
> (Red Hat 4.4.1-2)
>
> Regards, Jacob
>
> > On Wed, Jan 20, 2010 at 2:17 PM, Jacob Rief <jacob.r...@gmail.com> wrote:
> >>
> >> Hello Kenton,
> >> I have now run some benchmarks while Seek'ing through a FileInputStream.
> >> The testing code looks like this:
> >>
> >>  // initialize boost::posix_time
> >>  boost::posix_time::ptime t0(boost::posix_time::microsec_clock::local_time());
> >>  boost::shared_ptr<google::protobuf::io::FileInputStream> fileInStream(
> >>      new google::protobuf::io::FileInputStream(fileDescriptor));
> >>  boost::posix_time::ptime t1(boost::posix_time::microsec_clock::local_time());
> >>  // using Seek(), the function available through my patch
> >>  fileInStream->Seek(offset, whence);
> >>  boost::posix_time::ptime t2(boost::posix_time::microsec_clock::local_time());
> >>  // this is the default method of achieving the same
> >>  ::lseek64(fileDescriptor, offset, whence);
> >>  fileInStream.reset(new google::protobuf::io::FileInputStream(fileDescriptor));
> >>  boost::posix_time::ptime t3(boost::posix_time::microsec_clock::local_time());
> >>  std::cerr << "t1: " << boost::posix_time::time_period(t1, t2).length()
> >>            << " t2: " << boost::posix_time::time_period(t2, t3).length()
> >>            << std::endl;
> >>
> >> and on my Intel Core2 Duo CPU E8400 (3.00GHz) with 4GB of RAM,
> >> gcc-Version 4.4.1 20090725, compiled with -O2
> >> I get these numbers:
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000007
> >> t1: 00:00:00.000001 t2: 00:00:00.000002
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000002 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >> t1: 00:00:00.000001 t2: 00:00:00.000003
> >> t1: 00:00:00.000001 t2: 00:00:00.000004
> >>
> >> In absolute numbers, ~1 microsecond compared to 3-4 microseconds is
> >> not a big difference,
> >> but from a relative point of view, direct Seek'ing is much faster than
> >> object recreation. And since
> >> I have to seek a lot in the FileInputStream, the measured times will
> >> accumulate.
> >>
> >> Regards, Jacob
> >>
> >> 2010/1/19 Kenton Varda <ken...@google.com>:
> >> > Did you do any tests to determine if the performance difference is
> >> > relevant?
> >> >
> >> > On Mon, Jan 18, 2010 at 3:14 PM, Jacob Rief <jacob.r...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hello Kenton,
> >> >>
> >> >> 2010/1/18 Kenton Varda <ken...@google.com>:
> >> >> (...snip...)
> >> >> > As for code cleanliness, I find the Reset() method awkward since the
> >> >> > user has to remember to call it at the same time as they do some other
> >> >> > operation, like seeking the file descriptor.  Either calling Reset() or
> >> >> > seeking the file descriptor alone will put the object in an inconsistent
> >> >> > state.  It might make more sense to offer an actual Seek() method which
> >> >> > can safely perform both operations together with an interface that is
> >> >> > not so easy to misuse.
> >> >>
> >> >> I agree, a 64bit Seek function would definitely be safer.
> >> >> Here is a patch for version 2.3.0 to add FileInputStream::Seek(). The
> >> >> patch also fixes a compiler warning.
> >> >> I also adapted my library to use this function and it seems to work.
> >> >> Regards, Jacob
> >> >
> >> >
> >
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To post to this group, send email to proto...@googlegroups.com.
> To unsubscribe from this group, send email to
> protobuf+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/protobuf?hl=en.
>
>
>
>

