Re: [1/4] DST: Distributed storage documentation.

2007-12-17 Thread Kay Sievers
On Dec 17, 2007 4:03 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 +++ b/Documentation/dst/sysfs.txt
 @@ -0,0 +1,33 @@
 +This file describes sysfs files created for each storage.
 +
 +1. Per-storage files.
 +Each storage has its own dir /sysfs/devices/$storage_name,

 +2. Per-node files.
 +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie

As already pointed out last time, you can't reference /sys/devices/ directly,
please use the path from the bus/class directory which points there.

Thanks,
Kay
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Kay Sievers
On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
 new file mode 100644
 index 000..79d79dc
 --- /dev/null
 +++ b/Documentation/dst/sysfs.txt
 @@ -0,0 +1,30 @@
 +This file describes sysfs files created for each storage.
 +
 +1. Per-storage files.
 +Each storage has its own dir /sysfs/devices/$storage_name,

It's always /sys/devices/.

 +which contains following files:
 +
 +alg - contains name of the algorithm used to created given storage
 +name - name of the storage
 +nodes - map of the storage (list of nodes and their sizes and starts)
 +remove_all_nodes - writable file which allows to remove all nodes from given
 +   storage
 +n-$start-$cookie - per node directory, where
 +   $start - start of the given node in sectors,
 +   $cookie - unique node's id used by DST
 +
 +2. Per-node files.
 +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
 +directory, described above.

To which class or bus do the devices you create belong? Care to show a
tree or ls -la of the device?

Kay


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 01:51:43PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
 On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
  diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
  new file mode 100644
  index 000..79d79dc
  --- /dev/null
  +++ b/Documentation/dst/sysfs.txt
  @@ -0,0 +1,30 @@
  +This file describes sysfs files created for each storage.
  +
  +1. Per-storage files.
  +Each storage has its own dir /sysfs/devices/$storage_name,
 
 It's always /sys/devices/.

I meant that for each new device, it will be placed into
/sys/devices/its_name, but it can also be accessed via
/sys/bus/dst/devices/

  +which contains following files:
  +
  +alg - contains name of the algorithm used to created given storage
  +name - name of the storage
  +nodes - map of the storage (list of nodes and their sizes and starts)
  +remove_all_nodes - writable file which allows to remove all nodes from 
  given
  +   storage
  +n-$start-$cookie - per node directory, where
  +   $start - start of the given node in sectors,
  +   $cookie - unique node's id used by DST
  +
  +2. Per-node files.
  +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
  +directory, described above.
 
 To which class or bus do the devices you create belong? Care to show a
 tree or ls -la of the device?

It is 'dst' bus.

uganda:~/codes# ls -la /sys/devices/storage/
total 0
drwxr-xr-x 4 root root    0 2007-12-10 11:46 .
drwxr-xr-x 9 root root    0 2007-12-10 11:46 ..
-r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
lrwxrwxrwx 1 root root    0 2007-12-10 11:46 bus -> ../../bus/dst
drwxr-xr-x 3 root root    0 2007-12-10 11:46 n-0-81003e24117
-r--r--r-- 1 root root 4096 2007-12-10 11:46 name
-r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
drwxr-xr-x 2 root root    0 2007-12-10 11:46 power
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
lrwxrwxrwx 1 root root    0 2007-12-10 11:46 subsystem -> ../../bus/dst
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
uganda:~/codes# ls -l /sys/bus/dst/
total 0
drwxr-xr-x 2 root root    0 2007-12-10 09:52 devices
drwxr-xr-x 2 root root    0 2007-12-10 09:52 drivers
-rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
--w------- 1 root root 4096 2007-12-10 11:46 drivers_probe


 Kay

-- 
Evgeniy Polyakov


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Kay Sievers
On Mon, 2007-12-10 at 15:58 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 01:51:43PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
  On Dec 10, 2007 12:47 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
   diff --git a/Documentation/dst/sysfs.txt b/Documentation/dst/sysfs.txt
   new file mode 100644
   index 000..79d79dc
   --- /dev/null
   +++ b/Documentation/dst/sysfs.txt
   @@ -0,0 +1,30 @@
   +This file describes sysfs files created for each storage.
   +
   +1. Per-storage files.
   +Each storage has its own dir /sysfs/devices/$storage_name,
  
  It's always /sys/devices/.
 
 I meant that for each new device, it will be placed into
 /sys/devices/its_name, but it can also be accessed via
 /sys/bus/dst/devices/

Still, it looks like a path. :)

Please don't reference any device directly with a /sys/devices/ path.
You have to use the subsystem links to the devices
in /sys/bus/dst/devices/. Devices are free to move around
in /sys/devices, even during runtime. Yours don't, but anyway, please
remove all mentions of direct access to /sys/devices/.

Btw, where is the top-level /sys/devices/storage/ coming from? I don't
see that in the code. We don't accept any new virtual parents here.
Your devices will automatically appear in /sys/devices/virtual/dst/, and
not below your own parent. But that path does not matter anyway, because
you should only access them from the /sys/bus/dst/devices/ directory.
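
For what it's worth, userspace code can follow this rule by always building
paths from the bus directory. The sketch below is only an illustration, not
part of the DST patches: dst_attr_path() and dst_read_attr() are hypothetical
helper names, and the bus directory is passed in as a parameter (normally
"/sys/bus/dst") so the code can be tried against a scratch directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<bus_dir>/devices/<dev>/<attr>" into buf.
 * bus_dir would normally be "/sys/bus/dst"; it is a parameter only so the
 * helper can be exercised against a scratch directory (an assumption of
 * this sketch, not something DST requires).
 * Returns 0 on success, -1 if the result would not fit.
 */
int dst_attr_path(char *buf, size_t len, const char *bus_dir,
		  const char *dev, const char *attr)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%s/devices/%s/%s", bus_dir, dev, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of an attribute (e.g. "alg" or "name") through the
 * bus symlink, never through /sys/devices directly.
 * Returns 0 on success, -1 on any error.
 */
int dst_read_attr(const char *bus_dir, const char *dev, const char *attr,
		  char *out, size_t len)
{
	char path[4096];
	FILE *f;

	if (dst_attr_path(path, sizeof(path), bus_dir, dev, attr) < 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';	/* sysfs values end with a newline */
	return 0;
}
```

With the layout shown in this thread, dst_read_attr("/sys/bus/dst", "storage",
"alg", buf, sizeof(buf)) would read the algorithm name of the device called
'storage' via the bus symlink, wherever the device lives under /sys/devices.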

And in general please don't claim generic names like storage in any
namespace for a very specific subsystem like this.

   +which contains following files:
   +
   +alg - contains name of the algorithm used to created given storage
   +name - name of the storage
   +nodes - map of the storage (list of nodes and their sizes and starts)
   +remove_all_nodes - writable file which allows to remove all nodes from 
   given
   +   storage
   +n-$start-$cookie - per node directory, where
   +   $start - start of the given node in sectors,
   +   $cookie - unique node's id used by DST
   +
   +2. Per-node files.
   +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie
   +directory, described above.
  
  To which class or bus do the devices you create belong? Care to show a
  tree or ls -la of the device?
 
 It is 'dst' bus.
 
 uganda:~/codes# ls -la /sys/devices/staorge/
 total 0
 drwxr-xr-x 4 root root0 2007-12-10 11:46 .
 drwxr-xr-x 9 root root0 2007-12-10 11:46 ..
 -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
 lrwxrwxrwx 1 root root0 2007-12-10 11:46 bus - ../../bus/dst
 drwxr-xr-x 3 root root0 2007-12-10 11:46 n-0-81003e24117
 -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
 -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
 drwxr-xr-x 2 root root0 2007-12-10 11:46 power
 -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
 lrwxrwxrwx 1 root root0 2007-12-10 11:46 subsystem - ../../bus/dst
 -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent

Ok, how does:
  ls -l /sys/devices/storage/n-0-81003e24117
look?

 uganda:~/codes# ls -l /sys/bus/dst/
 total 0
 drwxr-xr-x 2 root root0 2007-12-10 09:52 devices
 drwxr-xr-x 2 root root0 2007-12-10 09:52 drivers
 -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
 --w--- 1 root root 4096 2007-12-10 11:46 drivers_probe

How does:
  ls -l /sys/bus/dst/devices
look?


Further questions:
Why do you do your own refcounting instead of using kref?
Why don't you use groups for the attributes?
Why don't you use default attributes for the device, where you get all
error handling done by the core?

Kay



Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 03:31:48PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
  I meant that for each new device, it will be placed into
  /sys/devices/its_name, but it can also be accessed via
  /sys/bus/dst/devices/
 
 Still, it looks like a path. :)
 
 Please don't reference any device directly with a /sys/devices/ path.
 You have to use the subsystem links to the devices
 in /sys/bus/dst/devices/. Devices are free to move around
 in /sys/devices, even during runtime. Yours don't do, but anyway, please
 remove all mentioning of direct access to /sys/devices/.

Ok, I will update documentation to reference /sys/bus/dst/devices
instead of /sys/devices

 Btw, where is the top-level /sys/devices/storage/ coming from? I don't
 see that in the code. We don't accept any new virtual parents here.

 Your devices will automatically appear in /sys/devices/virtual/dst/, and
 not below your own parent. But that path does not matter anyway, because
 you should only access them from the /sys/bus/dst/devices/ directory.
 
 And in general please don't claim generic names like storage in any
 namespace for a very specific subsystem like this.

It is not a parent; it is an example for a device called 'storage'. If it
were called 'testing', the path would be /sys/devices/testing or, more
correctly, /sys/bus/dst/devices/testing :)

  It is 'dst' bus.
  
  uganda:~/codes# ls -la /sys/devices/staorge/
  total 0
  drwxr-xr-x 4 root root0 2007-12-10 11:46 .
  drwxr-xr-x 9 root root0 2007-12-10 11:46 ..
  -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
  lrwxrwxrwx 1 root root0 2007-12-10 11:46 bus - ../../bus/dst
  drwxr-xr-x 3 root root0 2007-12-10 11:46 n-0-81003e24117
  -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
  -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
  drwxr-xr-x 2 root root0 2007-12-10 11:46 power
  -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
  lrwxrwxrwx 1 root root0 2007-12-10 11:46 subsystem - ../../bus/dst
  -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
 
 Ok, how does:
   ls -l /sys/devices/storage/n-0-81003e24117
 look?

uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
total 0
drwxr-xr-x 2 root root    0 2007-12-10 13:23 power
-r--r--r-- 1 root root 4096 2007-12-10 13:30 size
-r--r--r-- 1 root root 4096 2007-12-10 13:30 start
-r--r--r-- 1 root root 4096 2007-12-10 13:30 type
-rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent


  uganda:~/codes# ls -l /sys/bus/dst/
  total 0
  drwxr-xr-x 2 root root0 2007-12-10 09:52 devices
  drwxr-xr-x 2 root root0 2007-12-10 09:52 drivers
  -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
  --w--- 1 root root 4096 2007-12-10 11:46 drivers_probe
 
 How does:
   ls -l /sys/bus/dst/devices
 look?

uganda:~/codes# ls -la /sys/bus/dst/devices/
total 0
drwxr-xr-x 2 root root 0 2007-12-10 13:30 .
drwxr-xr-x 4 root root 0 2007-12-10 13:22 ..
lrwxrwxrwx 1 root root 0 2007-12-10 13:30 storage -> ../../../devices/storage


Here 'storage' is just the name of the device; it can be
anything else.
 
 Further questions:
 Why do you do your own refcounting instead of using kref?

That's because I have always used atomic operations as reference counters
and have not tried krefs :)
They are actually the same (modulo tricky arches where smp_mb_* are
required), so I can replace them in the next release.

 Why don't you use groups for the attributes?

For 3-4 attributes it is faster to register them in a loop than to type
another structure :)

 Why don't you use default attributes for the device, where you get all
 error handling done by the core.

What are 'default attributes', and for which devices?
All my sysfs files are trivial, so they do not need anything
special, and I do not see what error handling you mean.

-- 
Evgeniy Polyakov


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 05:50:55PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
  Further questions:
  Why do you do your own refcounting instead of using kref?
 
 That's because I always used atomic operations as a reference counters
 and did not tried krefs :)
 They are the same actually (module tricky arches where smp_mb_* are
 required), so I can replace them in the next release.

Actually not - I have to set the reference counter to something other than 1
or +/- 1, and thus would have to call kref_get() in a loop, which is a
very ugly step. Is there kref_set() or something like that? At least there is
none in 2.6.22, which I'm using for now.
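
To make the trade-off concrete, here is a small userspace model using C11
atomics. It is not kernel code: struct ref and the ref_*() names are made up
for illustration and only mimic the shape of kref. It contrasts reaching a
count of num by calling the get operation in a loop with a single setter,
which is what a kref_set() would provide.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-in for struct kref; hypothetical, not the kernel type. */
struct ref {
	atomic_int refcount;
};

/* Mirrors kref_init(): the counter starts at 1. */
void ref_init(struct ref *r)
{
	atomic_store(&r->refcount, 1);
}

/* Mirrors kref_get(): take one more reference. */
void ref_get(struct ref *r)
{
	atomic_fetch_add(&r->refcount, 1);
}

int ref_read(struct ref *r)
{
	return atomic_load(&r->refcount);
}

/* The loop variant: init to 1, then take num - 1 references one by one. */
void ref_init_by_loop(struct ref *r, int num)
{
	int i;

	assert(num >= 1);
	ref_init(r);
	for (i = 1; i < num; i++)
		ref_get(r);
}

/* The setter variant: a single store of the requested count. */
void ref_set(struct ref *r, int num)
{
	assert(num >= 1);
	atomic_store(&r->refcount, num);
}
```

Both variants end with the same counter value; the setter simply gets there
with one atomic store instead of num - 1 increments.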

Sigh, I've converted most of the DST already...

-- 
Evgeniy Polyakov


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Kay Sievers
On Mon, 2007-12-10 at 17:50 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 03:31:48PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
   I meant that for each new device, it will be placed into
   /sys/devices/its_name, but it can also be accessed via
   /sys/bus/dst/devices/
  
  Still, it looks like a path. :)
  
  Please don't reference any device directly with a /sys/devices/ path.
  You have to use the subsystem links to the devices
  in /sys/bus/dst/devices/. Devices are free to move around
  in /sys/devices, even during runtime. Yours don't do, but anyway, please
  remove all mentioning of direct access to /sys/devices/.
 
 Ok, I will update documentation to reference /sys/bus/dst/devices
 instead of /sys/devices

Great, thanks!

  Btw, where is the top-level /sys/devices/storage/ coming from? I don't
  see that in the code. We don't accept any new virtual parents here.
 
  Your devices will automatically appear in /sys/devices/virtual/dst/, and
  not below your own parent. But that path does not matter anyway, because
  you should only access them from the /sys/bus/dst/devices/ directory.
  
  And in general please don't claim generic names like storage in any
  namespace for a very specific subsystem like this.
 
 It is not a parent - it is an example for device called 'storage', if it
 will be called 'testing', then path will be /sys/devices/testing or more
 correct /sys/bus/dst/devices/testing :)

Ah, I see.

   It is 'dst' bus.
   
   uganda:~/codes# ls -la /sys/devices/staorge/
   total 0
   drwxr-xr-x 4 root root0 2007-12-10 11:46 .
   drwxr-xr-x 9 root root0 2007-12-10 11:46 ..
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 alg
   lrwxrwxrwx 1 root root0 2007-12-10 11:46 bus - ../../bus/dst
   drwxr-xr-x 3 root root0 2007-12-10 11:46 n-0-81003e24117
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 name
   -r--r--r-- 1 root root 4096 2007-12-10 11:46 nodes
   drwxr-xr-x 2 root root0 2007-12-10 11:46 power
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 remove_all_nodes
   lrwxrwxrwx 1 root root0 2007-12-10 11:46 subsystem - ../../bus/dst
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 uevent
  
  Ok, how does:
ls -l /sys/devices/storage/n-0-81003e24117
  look?
 
 uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
 total 0
 drwxr-xr-x 2 root root0 2007-12-10 13:23 power
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
 -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
 -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent

This is a struct device instance without a subsystem (bus/class),
right? It will not send an uevent to userspace. Is that intended? Why
don't you add them all to the dst bus? 

   uganda:~/codes# ls -l /sys/bus/dst/
   total 0
   drwxr-xr-x 2 root root0 2007-12-10 09:52 devices
   drwxr-xr-x 2 root root0 2007-12-10 09:52 drivers
   -rw-r--r-- 1 root root 4096 2007-12-10 11:46 drivers_autoprobe
   --w--- 1 root root 4096 2007-12-10 11:46 drivers_probe
  
  How does:
ls -l /sys/bus/dst/devices
  look?
 
 uganda:~/codes# ls -la /sys/bus/dst/devices/
 total 0
 drwxr-xr-x 2 root root 0 2007-12-10 13:30 .
 drwxr-xr-x 4 root root 0 2007-12-10 13:22 ..
 lrwxrwxrwx 1 root root 0 2007-12-10 13:30 storage - ../../../devices/storage
 
 Here 'storage' is just a name for device called 'storage', it can be
 anything else.

Fine.

  Further questions:
  Why do you do your own refcounting instead of using kref?
 
 That's because I always used atomic operations as a reference counters
 and did not tried krefs :)
 They are the same actually (module tricky arches where smp_mb_* are
 required), so I can replace them in the next release.

On Mon, 2007-12-10 at 18:12 +0300, Evgeniy Polyakov wrote:
 Actually not - I have to set reference counter to something other than 1
 or +/- 1, and thus will have to call kref_get() in a loop, which is a
 very ugly step. Is there kref_set() or somethinglike that? At least not
 in 2.6.22 what I'm using for now.

Yeah, a loop would look pretty ugly. How about just adding kref_set(),
if you need it.

  Why don't you use groups for the attributes?
 
 For 3-4 attributes it is faster to register them in a loop than typing
 another structure :)

Yeah, but if you would need to recover from an error when the creation
of a file fails, a group would do the proper rollback.
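
The rollback argument can be illustrated with a small userspace model. This
is a sketch under stated assumptions, not the sysfs implementation: struct
model_attr and the model_*() functions are hypothetical stand-ins for
attribute files, and the 'fail' flag exists only to simulate a creation
error. The point is the all-or-nothing behaviour: if any file cannot be
created, the ones already created are removed again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one sysfs attribute file. */
struct model_attr {
	const char *name;
	int fail;	/* set to simulate a creation failure */
	int created;
};

int model_attr_create(struct model_attr *a)
{
	if (a->fail)
		return -1;
	a->created = 1;
	return 0;
}

void model_attr_remove(struct model_attr *a)
{
	a->created = 0;
}

/* Create every attribute in the array, or none of them. */
int model_group_create(struct model_attr *attrs, size_t n)
{
	size_t i;

	assert(attrs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int err = model_attr_create(&attrs[i]);

		if (err) {
			/* Roll back everything created before the failure. */
			while (i-- > 0)
				model_attr_remove(&attrs[i]);
			return err;
		}
	}
	return 0;
}
```

In the kernel the same effect comes from describing the files in a
struct attribute_group and registering it with sysfs_create_group(),
instead of looping over individual file creations.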

  Why don't you use default attributes for the device, where you get all
  error handling done by the core.
 
 What is 'default attributes' and for what devices?
 All my sysfs files are so much trivial, so they do not need anything
 special and I do not see what is error handling you mentioned.

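If all devices of a subsystem (bus/class) are of the same type, you can
set a default array of attributes in the struct bus/class to be
created for every device. If you have multiple types of devices in the
same subsystem (bus/class), you can assign a device_type, which
has the default attribute group.
That way the core will create the files before the event is sent out to
userspace, and the files can be accessed from the event itself. Not sure
if that is needed for dst.

Thanks,
Kay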

Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 08:02:28PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
  uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
  total 0
  drwxr-xr-x 2 root root0 2007-12-10 13:23 power
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
  -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
  -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent
 
 This is a struct device instance without a subsystem (bus/class),
 right? It will not send an uevent to userspace. Is that intended? Why
 don't you add them all to the dst bus? 

I created the dst bus for storage devices only. Nodes are very different
objects, and they do not actually need any events from above, but I need
to put some attributes somewhere, so each node is an 'empty' device.

  Actually not - I have to set reference counter to something other than 1
  or +/- 1, and thus will have to call kref_get() in a loop, which is a
  very ugly step. Is there kref_set() or somethinglike that? At least not
  in 2.6.22 what I'm using for now.
 
 Yeah, a loop would look pretty ugly. How about just adding kref_set(),
 if you need it.

Well, then distributed storage will not be able to build as a
standalone module, and kref_set() itself will not be accepted as a single
patch, since there are no in-kernel users :)
It is easily doable though.

   Why don't you use groups for the attributes?
  
  For 3-4 attributes it is faster to register them in a loop than typing
  another structure :)
 
 Yeah, but if you would need to recover from an error when the creation
 of a file fails, a group would do the proper rollback.

I do not care about such errors: if creation fails for a file that
exports the type of a node (i.e. the string 'L' or 'R') or some other
very meaningful info, then the system has bigger things to worry about,
so dst does not do anything special - it ignores such errors :)

On the exit path each file will be checked and removed correctly.
If additional sysfs files are added, I think a group is a good way to
implement them.

   Why don't you use default attributes for the device, where you get all
   error handling done by the core.
  
  What is 'default attributes' and for what devices?
  All my sysfs files are so much trivial, so they do not need anything
  special and I do not see what is error handling you mentioned.
 
 If all devices of a subsystem (bus/class) are of the same type, you can
 set a default array of attributes in the struct bus/class to be
 created at every device. If you have multiple types of devices in the
 same subsytem (bus/class) you can to assign a the device_type, which
 has the default attribute group.
 That way the core will create the files before the event is sent out to
 userspace, and the files can be access from the event itself. Not sure
 if that is needed for dst.

Ok, I see.

DST right now has 3 types of files: storage files, which are common to
every storage device; node files, which are the same for every node; and
per-algorithm private devices, which can differ (actually only the
mirroring algorithm exports something to userspace).

I think it is possible to use default attributes for storage devices,
but node devices do not have a bus/class, so they will be left untouched.

 Thanks,
 Kay

-- 
Evgeniy Polyakov
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Kay Sievers
On Mon, 2007-12-10 at 22:33 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 08:02:28PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
   uganda:~/codes# ls -l /sys/devices/storage/n-0-81003ebc220/
   total 0
   drwxr-xr-x 2 root root0 2007-12-10 13:23 power
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 size
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 start
   -r--r--r-- 1 root root 4096 2007-12-10 13:30 type
   -rw-r--r-- 1 root root 4096 2007-12-10 13:30 uevent
  
  This is a struct device instance without a subsystem (bus/class),
  right? It will not send an uevent to userspace. Is that intended? Why
  don't you add them all to the dst bus? 
 
 I created dst bus for storage devices only, nodes are very different
 objects, and actually they do not need any events from above, but I need
 to put some attributes somewhere, so it is 'empty' device.

Ok.

   Actually not - I have to set reference counter to something other than 1
   or +/- 1, and thus will have to call kref_get() in a loop, which is a
   very ugly step. Is there kref_set() or somethinglike that? At least not
   in 2.6.22 what I'm using for now.
  
  Yeah, a loop would look pretty ugly. How about just adding kref_set(),
  if you need it.
 
 Well, then it distributed storage will not be able to build as
 standalone module, and kref_set() itself will not be accepted as a single 
 patch, since there are no in-kernel users :)
 It is easily doable though.

Most rules have exceptions. :) Send a patch, so we can see what it looks
like.

Why don't you use groups for the attributes?
   
   For 3-4 attributes it is faster to register them in a loop than typing
   another structure :)
  
  Yeah, but if you would need to recover from an error when the creation
  of a file fails, a group would do the proper rollback.
 
 I do not care about such errors - if there is such an error for a file,
 which exports information about type of the node (i.e. string L or R)
 or some other very meaningful info, then system has enough to care about
 instead of this, so dst does not do anything special - it ignores such
 errors :)
 
 On exit path it will be checked and removed correctly.
 If there will be additional sysfs files, I think group is a good way to
 implement them.
 
Why don't you use default attributes for the device, where you get all
error handling done by the core.
   
   What is 'default attributes' and for what devices?
   All my sysfs files are so much trivial, so they do not need anything
   special and I do not see what is error handling you mentioned.
  
  If all devices of a subsystem (bus/class) are of the same type, you can
  set a default array of attributes in the struct bus/class to be
  created at every device. If you have multiple types of devices in the
  same subsytem (bus/class) you can to assign a the device_type, which
  has the default attribute group.
  That way the core will create the files before the event is sent out to
  userspace, and the files can be access from the event itself. Not sure
  if that is needed for dst.
 
 Ok, I see.
 
 DST right now has 3 types of files - storage files, it is common for
 every storage device; node files, which are the same for every node; and
 per-algorithm private devices - they can be different (actually only
 mirroring algorithm exports something to userspace).
 
 I think it is possible to use default attributes for storage devices,
 but node device does not have a bus/class, so they will be untouched.

Sounds fine.

Thanks,
Kay



Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
Actually not - I have to set reference counter to something other than 1
or +/- 1, and thus will have to call kref_get() in a loop, which is a
very ugly step. Is there kref_set() or somethinglike that? At least not
in 2.6.22 what I'm using for now.
   
   Yeah, a loop would look pretty ugly. How about just adding kref_set(),
   if you need it.
  
  Well, then it distributed storage will not be able to build as
  standalone module, and kref_set() itself will not be accepted as a single 
  patch, since there are no in-kernel users :)
  It is easily doable though.
 
 Most rules have exceptions. :) Send a patch, so we can see how it looks
 like.

It looks really non-trivial :)

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 6fee353..5d18563 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -24,6 +24,7 @@ struct kref {
 	atomic_t refcount;
 };
 
+void kref_set(struct kref *kref, int num);
 void kref_init(struct kref *kref);
 void kref_get(struct kref *kref);
 int kref_put(struct kref *kref, void (*release) (struct kref *kref));
diff --git a/lib/kref.c b/lib/kref.c
index a6dc3ec..40aa9f9 100644
--- a/lib/kref.c
+++ b/lib/kref.c
@@ -15,13 +15,23 @@
 #include <linux/module.h>
 
 /**
+ * kref_set - initialize object and set refcount to requested number.
+ * @kref: object in question.
+ * @num: initial reference counter
+ */
+void kref_set(struct kref *kref, int num)
+{
+	atomic_set(&kref->refcount, num);
+	smp_mb();
+}
+
+/**
  * kref_init - initialize object.
  * @kref: object in question.
  */
 void kref_init(struct kref *kref)
 {
-	atomic_set(&kref->refcount,1);
-	smp_mb();
+	kref_set(kref, 1);
 }
 
 /**

-- 
Evgeniy Polyakov


Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Kay Sievers
On Mon, 2007-12-10 at 22:51 +0300, Evgeniy Polyakov wrote:
 On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
 wrote:
 Actually not - I have to set reference counter to something other 
 than 1
 or +/- 1, and thus will have to call kref_get() in a loop, which is a
 very ugly step. Is there kref_set() or somethinglike that? At least 
 not
 in 2.6.22 what I'm using for now.

Yeah, a loop would look pretty ugly. How about just adding kref_set(),
if you need it.
   
   Well, then it distributed storage will not be able to build as
   standalone module, and kref_set() itself will not be accepted as a single 
   patch, since there are no in-kernel users :)
   It is easily doable though.
  
  Most rules have exceptions. :) Send a patch, so we can see how it looks
  like.
 
 It looks really non-trivial :)

Yeah, it does. :)
We miss an EXPORT_SYMBOL(), right?

Kay



Re: [1/4] DST: Distributed storage documentation.

2007-12-10 Thread Evgeniy Polyakov
On Mon, Dec 10, 2007 at 08:56:49PM +0100, Kay Sievers ([EMAIL PROTECTED]) wrote:
 On Mon, 2007-12-10 at 22:51 +0300, Evgeniy Polyakov wrote:
  On Mon, Dec 10, 2007 at 08:44:55PM +0100, Kay Sievers ([EMAIL PROTECTED]) 
  wrote:
  Actually not - I have to set reference counter to something other 
  than 1
  or +/- 1, and thus will have to call kref_get() in a loop, which is 
  a
  very ugly step. Is there kref_set() or somethinglike that? At least 
  not
  in 2.6.22 what I'm using for now.
 
 Yeah, a loop would look pretty ugly. How about just adding kref_set(),
 if you need it.

Well, then it distributed storage will not be able to build as
standalone module, and kref_set() itself will not be accepted as a 
single 
patch, since there are no in-kernel users :)
It is easily doable though.
   
   Most rules have exceptions. :) Send a patch, so we can see how it looks
   like.
  
  It looks really non-trivial :)
 
 Yeah, it does. :)
 We miss an EXPORT_SYMBOL(), right?

Yep :)

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 6fee353..5d18563 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -24,6 +24,7 @@ struct kref {
 	atomic_t refcount;
 };
 
+void kref_set(struct kref *kref, int num);
 void kref_init(struct kref *kref);
 void kref_get(struct kref *kref);
 int kref_put(struct kref *kref, void (*release) (struct kref *kref));
diff --git a/lib/kref.c b/lib/kref.c
index a6dc3ec..9ecd6e8 100644
--- a/lib/kref.c
+++ b/lib/kref.c
@@ -15,13 +15,23 @@
 #include <linux/module.h>
 
 /**
+ * kref_set - initialize object and set refcount to requested number.
+ * @kref: object in question.
+ * @num: initial reference counter
+ */
+void kref_set(struct kref *kref, int num)
+{
+	atomic_set(&kref->refcount, num);
+	smp_mb();
+}
+
+/**
  * kref_init - initialize object.
  * @kref: object in question.
  */
 void kref_init(struct kref *kref)
 {
-	atomic_set(&kref->refcount,1);
-	smp_mb();
+	kref_set(kref, 1);
 }
 
 /**
@@ -61,6 +71,7 @@ int kref_put(struct kref *kref, void (*release)(struct kref *kref))
 	return 0;
 }
 
+EXPORT_SYMBOL(kref_set);
 EXPORT_SYMBOL(kref_init);
 EXPORT_SYMBOL(kref_get);
 EXPORT_SYMBOL(kref_put);

-- 
Evgeniy Polyakov


Re: [1/4] dst: Distributed storage documentation.

2007-12-03 Thread Evgeniy Polyakov
Hi Matt.

On Sun, Dec 02, 2007 at 10:50:59PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
wrote:
  Distributed storage documentation.
  
  Algorithms used in the system, userspace interfaces
  (sysfs dirs and files), design and implementation details
  are described here.
 
 Can you give us a summary of how this differs from using device mapper
 with NBD?

From a high-level point of view it does not, but it operates quite differently:
it processes requests asynchronously and thus does not block, it has a
different protocol with smaller overhead, it supports strong checksums, and it
has an in-kernel export server which supports simple security attributes (i.e.
permission to connect, to read or to write). It uses less memory: zero
additional allocations in the common path for linear mapping (not including
network allocations), and fewer additional allocations in the mirroring case.
DST supports failure recovery after a dropped connection (the core will
reconnect to the remote node when it is ready), so it is possible to
turn remote nodes off and on without special administration steps. DST
has simple autoconfiguration at startup time (checksum support and
storage size autonegotiation). It is possible to turn one of the mirror
nodes off and use it as an offline backup, since a dst mirror node stores
data at the end of the storage, so it can be mounted locally.

-- 
Evgeniy Polyakov


Re: [1/4] dst: Distributed storage documentation.

2007-12-02 Thread Matt Mackall
On Thu, Nov 29, 2007 at 03:53:23PM +0300, Evgeniy Polyakov wrote:
 
 Distributed storage documentation.
 
 Algorithms used in the system, userspace interfaces
 (sysfs dirs and files), design and implementation details
 are described here.

Can you give us a summary of how this differs from using device mapper
with NBD?

-- 
Mathematics is the supreme nostalgia of our time.