Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-03-01 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 What I did not understand up to now: It is really easy to keep all this
 information in a helper data structure. Why extend Bigarray?

 Gerd

A helper data structure would mean keeping track of that information on
my own, complicating the source and wasting memory at runtime.

On the other hand, what I suggested only makes use of what is already
there in the Bigarray data structure. The functions would merely make that
information available to the user. I'm not a fan of duplicating
information that already exists; in the worst case the helper data
structure could be wrong due to some bug in the code.

Regards,
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-29 Thread Goswin von Brederlow
Gerd Stolpmann i...@gerd-stolpmann.de writes:

 On Tuesday, 28.02.2012, 21:17 +0100, Goswin von Brederlow wrote:
 Hi,
 
 I'm implementing a RAID in userspace using OCaml and NBD (Network Block
 Device) as the protocol to export the device. For this I'm using
 Bigarray.Array1 as the buffer for data and wrote myself the right read(),
 write(), pread() and pwrite() stubs. The advantage of this (over
 strings) is that Bigarrays are not moved around by the GC, so the data
 does not need to be copied between the Bigarray and a temporary buffer
 in the C stub.

 I used the same approach for PlasmaFS. The bigarray-based reads and
 writes are really missing in the stdlib. (If anybody wants to
 experiment, look into the Netsys_mem module of Ocamlnet, and use the
 functions mem_read and mem_write.)

 Btw, you should try to allocate the bigarrays in a page-aligned way.
 This makes the syscalls even more efficient.

Which is actually a requirement for libaio, at least alignment to the
block size. My libaio bindings include a stub to create an aligned
Bigarray.
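
A minimal sketch of how such a stub could look (this is not the actual
binding from my libaio package; caml_aligned_ba_create is just an
illustrative name, and error handling is reduced to the bare minimum):

#include <stdlib.h>
#include <unistd.h>
#include <caml/mlvalues.h>
#include <caml/fail.h>
#include <caml/bigarray.h>

/* Illustrative sketch: allocate a page-aligned buffer and wrap it in a
   1-dimensional char bigarray.  CAML_BA_MANAGED means the runtime frees
   the buffer with free() when the bigarray dies, which is what
   posix_memalign-allocated memory expects. */
CAMLprim value caml_aligned_ba_create(value vlen)
{
  intnat len = Long_val(vlen);
  long page = sysconf(_SC_PAGESIZE);
  void *buf = NULL;
  if (posix_memalign(&buf, page, len) != 0)
    caml_raise_out_of_memory();
  return caml_ba_alloc_dims(CAML_BA_CHAR | CAML_BA_C_LAYOUT | CAML_BA_MANAGED,
                            1, buf, len);
}

On the OCaml side this would be bound with an external of type
int -> (char, int8_unsigned_elt, c_layout) Array1.t.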

 In my use case, I did not write to devices directly, and could assume
 that the blocks are backed by the page cache. So I did not run into this
 problem you describe in the following.

 For efficiency each request stores all its data in a single
 Bigarray.Array1. For reasons of the RAID implementation large requests
 are split into 4k chunks using Bigarray.Array1.sub and grouped into
 stripes. The stripes are then acted upon independently until all stripes
 involved in a request are finished. Then the request is completed.
 
 Now I get a problem when 2 requests come in that overlap. Say I get a
 write request for blocks 0-6 and a read request for blocks 4-9, in
 that order. Since I already have the data for blocks 4-6 in memory from
 the write request I do not want to reread them from disk. On the stripe
 level the data looks like this:

 |W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
 |.  .  .  .  W4 W5 W6 R7 R8 R9 | read  4-9
 
 For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
 in two writes. I think I would like to avoid sending each stripe chunk
 out separately.

 Why not? This seems to be an elegant solution, and I don't see why this
 should make the accesses slower. The time for the extra context switches
 is negligible compared to the disk accesses.

A lot of the time data will just move between caches, so disk speed is
secondary. And sending each chunk separately would mean up to 32 context
switches instead of 1. And writing to a socket in chunks can lead to the
data being sent out in frames smaller than the MTU, which would be
wasteful. So I think there is some gain in limiting this.

  On the other hand I could implement (p)readv() and
 (p)writev() stubs.
 
 Anyway, my question now is how to detect which subarrays in the stripes
 are part of a common larger array? Do I need to write my own C stub that
 looks into the internals of the arrays to see if they share a common
 ancestor?  I think that would be preferable to tracking the origin of
 each subarray myself.

 Yes, subarrays are tracked, but this feature exists only to catch the
 right moment for unmapping bigarrays (if needed). As far as I remember,
 this is not tracked as a sub/super relationship; rather, the bigarrays
 accessing the same buffer then share the same buffer descriptor, and
 when the use count drops to 0, the buffer is finally destroyed. So, you
 cannot say which bigarray is the original one, and it can even be that
 the original bigarray is already deallocated but the backing buffer is
 not yet.

All subarrays share a struct caml_ba_proxy which, as you say, exists to
catch the right moment for unmapping bigarrays. So two subarrays are part
of the same bigarray if they have the same proxy object. That seems like
an easy enough test. Which one is the original bigarray doesn't matter,
at least to me.
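
Roughly, such a test could be a tiny C stub (sketch only; it peeks at the
runtime internals from caml/bigarray.h, and caml_ba_same_backing_buffer is
a made-up name):

#include <caml/mlvalues.h>
#include <caml/bigarray.h>

/* Illustrative sketch: two bigarrays share a backing buffer if they share
   the same proxy.  An array that was never sub'ed or sliced has
   proxy == NULL, so the data pointers are compared as well to catch the
   case of being handed the very same view twice. */
CAMLprim value caml_ba_same_backing_buffer(value v1, value v2)
{
  struct caml_ba_array *b1 = Caml_ba_array_val(v1);
  struct caml_ba_array *b2 = Caml_ba_array_val(v2);
  int same = (b1->proxy != NULL && b1->proxy == b2->proxy)
             || b1->data == b2->data;
  return Val_bool(same);
}

Whether two such views actually overlap is then just a comparison of their
data pointers and sizes.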

 On a similar note how does Bigarray.Array1.blit handle arrays that are
 part of the same larger array and overlap?
 
 --
 I'm missing functions like:
 
 val super : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> ('a, 'b, 'c) t
 Create a merged sub-array of 2 adjacent sub-arrays of the same
 big array.

 This function would be possible to implement. The requirement would be
 that both bigarrays use the same buffer descriptor.

 val same_parent : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> bool
 True if the 2 (sub-)arrays are part of the same big array.

 I would not call it same_parent, but same_backing_buffer.

Maybe "related" would be better?

 val offset : ('a, 'b, 'c) t -> int
 Offset of the sub-array in the underlying big array.

 I think this information is not available in the current implementation.
 As buffer sharing is also possible after reshaping, this is also not
 meaningful in the general case.

offset = array->data - array->proxy->data;

After reshaping the offset would be the offset into the original array
after reshaping that, too.

For me the offset would be relevant to see which 2 sub-arrays could be
merged with super above. That it wouldn't always be meaningful I could
accept.
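
Roughly, that could be another tiny stub (again only a sketch;
caml_ba_byte_offset is a made-up name, and it reports a byte offset that
is only meaningful when the array really came from sub/slice):

#include <caml/mlvalues.h>
#include <caml/bigarray.h>

/* Illustrative sketch: byte offset of this view into the shared backing
   buffer.  Arrays that were never sub'ed have no proxy, so report 0. */
CAMLprim value caml_ba_byte_offset(value v)
{
  struct caml_ba_array *b = Caml_ba_array_val(v);
  if (b->proxy == NULL) return Val_long(0);
  return Val_long((char *) b->data - (char *) b->proxy->data);
}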

Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-29 Thread Gerd Stolpmann
On Wednesday, 29.02.2012, 10:23 +0100, Goswin von Brederlow wrote:
  For efficiency each request stores all its data in a single
  Bigarray.Array1. For reasons of the RAID implementation large requests
  are split into 4k chunks using Bigarray.Array1.sub and grouped into
  stripes. The stripes are then acted upon independently until all stripes
  involved in a request are finished. Then the request is completed.
  
  Now I get a problem when 2 requests come in that overlap. Say I get a
  write request for blocks 0-6 and a read request for blocks 4-9, in
  that order. Since I already have the data for blocks 4-6 in memory from
  the write request I do not want to reread them from disk. On the stripe
  level the data looks like this:

  |W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
  |.  .  .  .  W4 W5 W6 R7 R8 R9 | read  4-9
  
  For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
  in two writes. I think I would like to avoid sending each stripe chunk
  out separately.
 
  Why not? This seems to be an elegant solution, and I don't see why this
  should make the accesses slower. The time for the extra context switches
  is negligible compared to the disk accesses.
 
 A lot of the time data will just move between caches, so disk speed is
 secondary. And sending each chunk separately would mean up to 32 context
 switches instead of 1. And writing to a socket in chunks can lead to the
 data being sent out in frames smaller than the MTU, which would be
 wasteful. So I think there is some gain in limiting this.

A context switch takes less than a microsecond on typical hardware,
including the time for rebuilding the TLB from the page table. Sending a
4K block over gigabit network takes more than 30 times longer. I doubt
you can gain from combining several chunks.
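(Back of the envelope: a 4K block is 4096 * 8 = 32768 bits, which needs
about 33 microseconds on the wire at 1 Gbit/s, against roughly 1
microsecond for the context switch.)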

Sending the data in frames less than the MTU can normally only happen
when you disable the Nagle algorithm (did you?). And even then, you only
need to fill the TCP buffer faster than the network can transport the
data to avoid it. 

   On the other hand I could implement (p)readv() and
  (p)writev() stubs.
  
  Anyway, my question now is how to detect which subarrays in the stripes
  are part of a common larger array? Do I need to write my own C stub that
  looks into the internals of the arrays to see if they share a common
  ancestor?  I think that would be preferable to tracking the origin of
  each subarray myself.
 
  Yes, subarrays are tracked, but this feature exists only to catch the
  right moment for unmapping bigarrays (if needed). As far as I remember,
  this is not tracked as a sub/super relationship; rather, the bigarrays
  accessing the same buffer then share the same buffer descriptor, and
  when the use count drops to 0, the buffer is finally destroyed. So, you
  cannot say which bigarray is the original one, and it can even be that
  the original bigarray is already deallocated but the backing buffer is
  not yet.
 
 All subarrays share a struct caml_ba_proxy which, as you say, exists to
 catch the right moment for unmapping bigarrays. So two subarrays are part
 of the same bigarray if they have the same proxy object. That seems like
 an easy enough test. Which one is the original bigarray doesn't matter,
 at least to me.
 
  On a similar note how does Bigarray.Array1.blit handle arrays that are
  part of the same larger array and overlap?
  
  --
  I'm missing functions like:
  
  val super : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> ('a, 'b, 'c) t
  Create a merged sub-array of 2 adjacent sub-arrays of the same
  big array.
 
  This function would be possible to implement. The requirement would be
  that both bigarrays use the same buffer descriptor.
 
  val same_parent : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> bool
  True if the 2 (sub-)arrays are part of the same big array.
 
  I would not call it same_parent, but same_backing_buffer.
 
 Maybe "related" would be better?
 
  val offset : ('a, 'b, 'c) t -> int
  Offset of the sub-array in the underlying big array.
 
  I think this information is not available in the current implementation.
  As buffer sharing is also possible after reshaping, this is also not
  meaningful in the general case.
 
 offset = array->data - array->proxy->data;
 
 After reshaping the offset would be the offset into the original array
 after reshaping that, too.

 For me the offset would be relevant to see which 2 sub-arrays could be
 merged with super above. That it wouldn't always be meaningful I could
 accept.

What I did not understand up to now: It is really easy to keep all this
information in a helper data structure. Why extend Bigarray?

Gerd

 
  Gerd
 
 Regards,
 Goswin
 

-- 

Gerd Stolpmann, Darmstadt, Germany          g...@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de

[Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-28 Thread Goswin von Brederlow
Hi,

I'm implementing a RAID in userspace using OCaml and NBD (Network Block
Device) as the protocol to export the device. For this I'm using
Bigarray.Array1 as the buffer for data and wrote myself the right read(),
write(), pread() and pwrite() stubs. The advantage of this (over
strings) is that Bigarrays are not moved around by the GC, so the data
does not need to be copied between the Bigarray and a temporary buffer
in the C stub.

For efficiency each request stores all its data in a single
Bigarray.Array1. For reasons of the RAID implementation large requests
are split into 4k chunks using Bigarray.Array1.sub and grouped into
stripes. The stripes are then acted upon independently until all stripes
involved in a request are finished. Then the request is completed.

Now I get a problem when 2 requests come in that overlap. Say I get a
write request for blocks 0-6 and a read request for blocks 4-9, in
that order. Since I already have the data for blocks 4-6 in memory from
the write request I do not want to reread them from disk. On the stripe
level the data looks like this:

|W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
|.  .  .  .  W4 W5 W6 R7 R8 R9 | read  4-9

For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
in two writes. I think I would like to avoid sending each stripe chunk
out separately. On the other hand I could implement (p)readv() and
(p)writev() stubs.
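
A writev() stub over an array of char bigarrays would not be much code
either. A rough sketch (illustrative only: caml_ba_writev is a made-up
name, it assumes Unix file descriptors, and EINTR and partial writes are
not handled):

#include <sys/uio.h>
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/fail.h>
#include <caml/signals.h>
#include <caml/bigarray.h>

/* Illustrative sketch: gather-write an OCaml array of 1-dimensional char
   bigarrays with a single writev() call. */
CAMLprim value caml_ba_writev(value vfd, value vbufs)
{
  CAMLparam2(vfd, vbufs);
  int i, n = Wosize_val(vbufs);
  int fd = Int_val(vfd);
  struct iovec *iov =
    (struct iovec *) caml_stat_alloc(n * sizeof(struct iovec));
  ssize_t written;
  for (i = 0; i < n; i++) {
    struct caml_ba_array *b = Caml_ba_array_val(Field(vbufs, i));
    iov[i].iov_base = b->data;
    iov[i].iov_len  = b->dim[0];        /* char arrays: 1 byte per element */
  }
  /* Bigarray data lives outside the OCaml heap, so the runtime lock can
     be released while the kernel reads from the buffers. */
  caml_enter_blocking_section();
  written = writev(fd, iov, n);
  caml_leave_blocking_section();
  caml_stat_free(iov);
  if (written < 0) caml_failwith("writev failed");
  CAMLreturn(Val_long(written));
}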

Anyway, my question now is how to detect which subarrays in the stripes
are part of a common larger array? Do I need to write my own C stub that
looks into the internals of the arrays to see if they share a common
ancestor?  I think that would be preferable to tracking the origin of
each subarray myself.


On a similar note how does Bigarray.Array1.blit handle arrays that are
part of the same larger array and overlap?

--
I'm missing functions like:

val super : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> ('a, 'b, 'c) t
Create a merged sub-array of 2 adjacent sub-arrays of the same
big array.

val same_parent : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> bool
True if the 2 (sub-)arrays are part of the same big array.

val offset : ('a, 'b, 'c) t -> int
Offset of the sub-array in the underlying big array.

Regards,
Goswin

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...

2012-02-28 Thread Gerd Stolpmann
On Tuesday, 28.02.2012, 21:17 +0100, Goswin von Brederlow wrote:
 Hi,
 
 I'm implementing a RAID in userspace using OCaml and NBD (Network Block
 Device) as the protocol to export the device. For this I'm using
 Bigarray.Array1 as the buffer for data and wrote myself the right read(),
 write(), pread() and pwrite() stubs. The advantage of this (over
 strings) is that Bigarrays are not moved around by the GC, so the data
 does not need to be copied between the Bigarray and a temporary buffer
 in the C stub.

I used the same approach for PlasmaFS. The bigarray-based reads and
writes are really missing in the stdlib. (If anybody wants to
experiment, look into the Netsys_mem module of Ocamlnet, and use the
functions mem_read and mem_write.)

Btw, you should try to allocate the bigarrays in a page-aligned way.
This makes the syscalls even more efficient.

In my use case, I did not write to devices directly, and could assume
that the blocks are backed by the page cache. So I did not run into this
problem you describe in the following.

 For efficiency each request stores all its data in a single
 Bigarray.Array1. For reasons of the RAID implementation large requests
 are split into 4k chunks using Bigarray.Array1.sub and grouped into
 stripes. The stripes are then acted upon independently until all stripes
 involved in a request are finished. Then the request is completed.
 
 Now I get a problem when 2 requests come in that overlap. Say I get a
 write request for blocks 0-6 and a read request for blocks 4-9, in
 that order. Since I already have the data for blocks 4-6 in memory from
 the write request I do not want to reread them from disk. On the stripe
 level the data looks like this:

 |W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
 |.  .  .  .  W4 W5 W6 R7 R8 R9 | read  4-9
 
 For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
 in two writes. I think I would like to avoid sending each stripe chunk
 out separately.

Why not? This seems to be an elegant solution, and I don't see why this
should make the accesses slower. The time for the extra context switches
is negligible compared to the disk accesses.

  On the other hand I could implement (p)readv() and
 (p)writev() stubs.
 
 Anyway, my question now is how to detect which subarrays in the stripes
 are part of a common larger array? Do I need to write my own C stub that
 looks into the internals of the arrays to see if they share a common
 ancestor?  I think that would be preferable to tracking the origin of
 each subarray myself.

Yes, subarrays are tracked, but this feature exists only to catch the
right moment for unmapping bigarrays (if needed). As far as I remember,
this is not tracked as a sub/super relationship; rather, the bigarrays
accessing the same buffer then share the same buffer descriptor, and
when the use count drops to 0, the buffer is finally destroyed. So, you
cannot say which bigarray is the original one, and it can even be that
the original bigarray is already deallocated but the backing buffer is
not yet.

 On a similar note how does Bigarray.Array1.blit handle arrays that are
 part of the same larger array and overlap?
 
 --
 I'm missing functions like:
 
 val super : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> ('a, 'b, 'c) t
 Create a merged sub-array of 2 adjacent sub-arrays of the same
 big array.

This function would be possible to implement. The requirement would be
that both bigarrays use the same buffer descriptor.

 val same_parent : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> bool
 True if the 2 (sub-)arrays are part of the same big array.

I would not call it same_parent, but same_backing_buffer.

 val offset : ('a, 'b, 'c) t -> int
 Offset of the sub-array in the underlying big array.

I think this information is not available in the current implementation.
As buffer sharing is also possible after reshaping, this is also not
meaningful in the general case.

Gerd

 Regards,
 Goswin
 

-- 

Gerd Stolpmann, Darmstadt, Germany          g...@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.



-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs