I am trying to heavily optimise a file reader that deals with multi-gigabyte
files. The "standard" way of doing it in .NET, using FileStream as shown
below, is certainly not bad, but I would like to know how to get even better
performance.

"Standard" .NET way I use as a base for benchmarking:

const int BufferSize = 65536; // this seems to work best on my computer
FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
    FileShare.Read, BufferSize, FileOptions.SequentialScan);
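
For reference, a baseline read loop around that stream might look like the sketch below. The class and method names (BaselineReader, ReadAll) are hypothetical and only there to make the snippet self-contained; the FileStream arguments are the ones shown above.

```csharp
using System;
using System.IO;

static class BaselineReader
{
    // Hypothetical helper: reads the whole file sequentially through a
    // FileStream and returns the number of bytes read.
    public static long ReadAll(string path, int bufferSize)
    {
        long total = 0;
        byte[] buffer = new byte[bufferSize];

        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, bufferSize, FileOptions.SequentialScan))
        {
            int count;
            // Read returns 0 only at end of file.
            while ((count = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += count;
            }
        }

        return total;
    }
}
```

Timing a call such as BaselineReader.ReadAll(path, 65536) with System.Diagnostics.Stopwatch gives a number to compare the other options against.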


Options I tried:

1- Create a custom Encoding for ASCII as a proof of concept for other
encodings. Using a specially crafted memcpy-style function, I get around
2x-3x the throughput of UTF8Encoding (which is faster than ASCIIEncoding,
for some reason). Basically, for each byte read from the source byte*
buffer, the function writes a zero high byte plus that byte as one 16-bit
char into the destination char* buffer.
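
The widening step described in option 1 can be sketched in safe code as follows. This is not the unsafe pointer-based function from the post, only an illustration of the same idea; the names AsciiWiden and Widen are hypothetical.

```csharp
using System;

static class AsciiWiden
{
    // Hypothetical helper: zero-extends each source byte into one UTF-16 char
    // and returns the number of chars written.
    public static int Widen(byte[] source, int offset, int count,
        char[] destination, int destOffset)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if (offset < 0 || count < 0 || source.Length - offset < count)
            throw new ArgumentOutOfRangeException("count");
        if (destOffset < 0 || destination.Length - destOffset < count)
            throw new ArgumentOutOfRangeException("destOffset");

        for (int i = 0; i < count; i++)
        {
            // The cast leaves the high byte of the char at zero.
            destination[destOffset + i] = (char)source[offset + i];
        }

        return count;
    }
}
```

Note that bytes above 0x7F are passed through as U+0080..U+00FF, so this is only a correct ASCII decoder when the input really is 7-bit.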

2- Create a SafeFileHandle using the Win32 function CreateFile with the
FILE_FLAG_NO_BUFFERING flag. I get very bad results with this, using a 64K
buffer. My cluster size is 4K. Sample code:

using (SafeFileHandle handle = CreateFile(path, (uint) FileAccess.Read,
    (uint) FileShare.Read, IntPtr.Zero, (uint) FileMode.Open,
    FILE_FLAG_NO_BUFFERING, IntPtr.Zero))
{
  using (FileStream fs = new FileStream(handle, FileAccess.Read, BufferSize, false))
  {
    byte[] buffer = new byte[BufferSize];
    int count = BufferSize;

    while (count == BufferSize && (count = fs.Read(buffer, 0, BufferSize)) > 0)
    {
    }
  }
}
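
The sample above relies on a P/Invoke declaration that is not shown. A minimal sketch of what it presumably looks like is given below; the containing class name NativeMethods is an assumption, while the function signature and the flag value follow the Win32 headers.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class NativeMethods
{
    // Value from the Windows SDK headers (winbase.h).
    public const uint FILE_FLAG_NO_BUFFERING = 0x20000000;

    // Windows-only: with CharSet.Unicode this binds to CreateFileW in kernel32.dll.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern SafeFileHandle CreateFile(
        string lpFileName,
        uint dwDesiredAccess,
        uint dwShareMode,
        IntPtr lpSecurityAttributes,
        uint dwCreationDisposition,
        uint dwFlagsAndAttributes,
        IntPtr hTemplateFile);
}
```

Two details may matter when interpreting the results: (uint) FileAccess.Read is 1, which corresponds to FILE_READ_DATA rather than GENERIC_READ (0x80000000), and with FILE_FLAG_NO_BUFFERING the Win32 documentation requires read sizes and file offsets to be multiples of the volume sector size.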

3- Use the memory-mapped file API. I found a wrapper on GotDotNet by MetalWrench
(http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=647E4735-9DCE-42F3-9432-CB3E9A1ECA5B).
I also get very bad results with the code below:

static void Test11(string path)
{
  Encoding encoding = new FastASCIIEncoding();
  char[] chars = new char[BufferSize];

  using (MemoryMappedFileStream mmf = new MemoryMappedFileStream(path, FileAccess.Read)) //, 0, 105571505, ""))
  {
    byte[] buffer = new byte[BufferSize];
    int count;

    while ((count = mmf.Read(buffer, 0, BufferSize)) > 0)
    {
    }
  }
}
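
For comparison, the same read loop can be sketched against the System.IO.MemoryMappedFiles types that ship with later versions of the framework (.NET Framework 4.0 and up). This swaps in a different API from the GotDotNet wrapper used above, so it is an assumption that such a framework version is available; the names MappedReader and ReadAll are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class MappedReader
{
    // Hypothetical helper: maps the file read-only, streams through a single
    // view, and returns the number of bytes read.
    public static long ReadAll(string path, int bufferSize)
    {
        long length = new FileInfo(path).Length;
        if (length == 0) return 0; // an empty file cannot be mapped

        long total = 0;
        byte[] buffer = new byte[bufferSize];

        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        // The view size is passed explicitly so the stream ends exactly at end of file.
        using (MemoryMappedViewStream view = mmf.CreateViewStream(
            0, length, MemoryMappedFileAccess.Read))
        {
            int count;
            while ((count = view.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += count;
            }
        }

        return total;
    }
}
```

In a 32-bit process a single view over a multi-gigabyte file will not fit in the address space, so a real reader would have to map the file in smaller windows.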

-- 
Sébastien
www.sebastienlorion.com
