On 1/5/11 12:44 AM, Christopher Smith wrote:
> On Tue, Jan 4, 2011 at 9:11 PM, Da Zheng <[email protected]> wrote:
>
>> On 1/4/11 5:17 PM, Christopher Smith wrote:
>>> If you use direct I/O to reduce CPU time, that means you are saving CPU
>>> via DMA. If you are using Java's heap though, you can kiss that goodbye.
>> The buffer for direct I/O cannot be allocated from Java's heap anyway, so I
>> don't understand what you mean.
>
>
> The DMA buffer cannot be on Java's heap, but in the typical use case (say
> Hadoop), it would certainly have to get copied either into or out of
> Java's heap, and that's going to get the CPU involved whether you like it
> or not. If you stay entirely off the Java heap, you really don't get to use
> much of Java's object model or capabilities, so you have to wonder why use
> Java in the first place.
True. I wrote the code with JNI, and found it still gets very close to its best
performance even when doing one or two memory copies.
>
>
>>> That said, I'm surprised that the Atom can't keep up with magnetic disk
>>> unless you have a striped array. 100MB/s shouldn't be too taxing. Is it
>>> possible you're doing something wrong or your CPU is otherwise occupied?
>> Yes, my C program can reach 100MB/s or even 110MB/s when writing data to the
>> disk sequentially, but with direct I/O enabled, the maximum throughput is
>> about 140MB/s. But the biggest difference is CPU usage.
>> Without direct I/O, the operating system uses a lot of CPU time (the data
>> below was obtained with top, and this is a dual-core processor with
>> hyperthreading enabled).
>> Cpu(s): 3.4%us, 32.8%sy, 0.0%ni, 50.0%id, 12.1%wa, 0.0%hi, 1.6%si, 0.0%st
>> But with direct I/O, the system time can be as little as 3%.
>>
>
> I'm surprised that system time is really that high. We did Atom experiments
> where it wasn't even close to that. Are you using a memory mapped file? If
No, I don't. I just write a large chunk of data to a buffer and then write() it
to the file; the code is attached below. Right now the buffer size is 1MB,
which I think is big enough to get the best performance.
> not are you buffering your writes? Is there perhaps
> something dysfunctional about the drive controller/driver you are using?
I'm not sure. It's odd to me too, but I thought that was all I could get from
an Atom processor. I guess I need to do some profiling.
Also, which Atom processors did you use? Do you have hyperthreading enabled?
Best,
Da
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define bufsize (1024 * 1024)	/* 1MB write buffer */

static char *buf;
static time_t start_time;
static long tot_size;

/* sighandler and fill_data were referenced but not shown in the mail;
   minimal versions are added here so the program compiles stand-alone. */

/* print the overall average rate on Ctrl-C and exit */
static void sighandler (int sig)
{
	time_t end_time = time (NULL);
	if (end_time > start_time)
		printf ("average rate: %ld\n", tot_size / (end_time - start_time));
	exit (0);
}

/* touch every page of the buffer with fresh data before each write */
static void fill_data (int *data, int size)
{
	int i;
	for (i = 0; i < size / (int) sizeof (int); i++)
		data[i] = i;
}

int main (int argc, char *argv[])
{
	char *out_file;
	int outfd;
	ssize_t size;
	time_t start_time2;
	long size1 = 0;

	out_file = argv[1];
	outfd = open (out_file, O_CREAT | O_WRONLY, S_IWUSR | S_IRUSR);
	if (outfd < 0) {
		perror ("open");
		return -1;
	}
	buf = mmap (0, bufsize, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror ("mmap");
		return -1;
	}
	start_time2 = start_time = time (NULL);
	signal (SIGINT, sighandler);
	long offset = 0;	/* long, so it survives writes past 2GB */
	while (1) {
		fill_data ((int *) buf, bufsize);
		size = write (outfd, buf, bufsize);
		if (size < 0) {
			perror ("write");
			return 1;
		}
		offset += size;
		tot_size += size;
		size1 += size;
		// if (posix_fadvise (outfd, 0, offset, POSIX_FADV_NOREUSE) < 0)
		//	perror ("posix_fadvise");
		time_t end_time = time (NULL);
		if (end_time - start_time2 > 5) {
			/* report throughput in bytes/s roughly every 5 seconds */
			printf ("current rate: %ld\n",
				(long) (size1 / (end_time - start_time2)));
			size1 = 0;
			start_time2 = end_time;
		}
	}
}