Re: [4/4] DST: Algorithms used in distributed storage.

2007-12-12 Thread Evgeniy Polyakov
On Wed, Dec 12, 2007 at 12:12:47PM +0300, Dmitry Monakhov ([EMAIL PROTECTED]) 
wrote:
> On 14:47 Mon 10 Dec , Evgeniy Polyakov wrote:
> > 
> > Algorithms used in distributed storage.
> > Mirror and linear mapping code.
> Hi, i've finally take a look on your DST solution.
> It seems what your current implementation will not work on nonstandard
> devices for example software raid0.
> other comments are follows:

> > +static int dst_mirror_process_node_data(struct dst_node *n,
> > +   struct dst_mirror_node_data *ndata, int op)

> > +
> > +   kunmap(cmp->page);
> << MINOR_BUG:
>You has forgot to unmap page on error path, so IMHO it is better to move
>kunmap to "err_out_free_cmp" label.

Yep, I will fix this.

> > +   priv = kzalloc(sizeof(struct dst_mirror_priv), GFP_KERNEL);
> > +   if (!priv)
> > +   return -ENOMEM;
> > +
> > +   priv->chunk_num = st->disk_size;
> > +
> > +   priv->chunk = vmalloc(DIV_ROUND_UP(priv->chunk_num, BITS_PER_LONG) * 
> > sizeof(long));
> << Ohhh. My. I want to add my 500G hdd. Do you really wanna
>say what i have to store 128Mb in memory object for this.

Right now yes. There was a code which used single bit for bigger
data units, but I dropped it because of resync troubles (i.e. when
one single sector has been updated, it requires to resync the whole
block). I can not say which case is better though.

> > +   dprintk("%s: start: %llu, size: %llu/%u, bio: %p, req: %p, "
> > +   "node: %p.\n",
> > +   __func__, req->start, req->size, nr_pages, bio,
> > +   req, req->node);
> > +
> > +   err = n->st->queue->make_request_fn(n->st->queue, bio);
> << Why direct make_request_fn instead of generic_make_request?

generic_make_request() will queue the bio in this case,
so I call request_fn directly.

> > +   for (i = 0; i < DIV_ROUND_UP(priv->chunk_num, BITS_PER_LONG); ++i) {
> > +   int bit, num, start;
> > +   unsigned long word = priv->chunk[i];
> > +
> > +   if (!word)
> > +   continue;
> > +
> > +   num = 0;
> > +   start = -1;
> > +   while (word && num < BITS_PER_LONG) {
> > +   bit = __ffs(word);
> > +   if (start == -1)
> > +   start = bit;
> > +   num++;
> << MINOR_BUG: Seems you have misstyped here. AFAIU @num represent position
>of last non zero bit (start + num == last_non_zero_bit_pos)
>   if (start == -1) {
>   start = bit;
>   num = 1;
>   } else
> num += bit;

Yes, you are right of course.
Since I shift word to more than a single bit, @num has to be update
accordingly.

> > +   word >>= (bit+1);

Dmitry, thanks a lot for comments, I will fix issues you pointed in the
next release, although will stay bitmap case opened for a while.

-- 
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [4/4] DST: Algorithms used in distributed storage.

2007-12-12 Thread Dmitry Monakhov
On 14:47 Mon 10 Dec , Evgeniy Polyakov wrote:
> 
> Algorithms used in distributed storage.
> Mirror and linear mapping code.
Hi, i've finally take a look on your DST solution.
It seems what your current implementation will not work on nonstandard
devices for example software raid0.
other comments are follows:
> 
> Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>
> 
> 
> diff --git a/drivers/block/dst/alg_linear.c b/drivers/block/dst/alg_linear.c
> new file mode 100644
> index 000..9dc0976
> --- /dev/null
> +++ b/drivers/block/dst/alg_linear.c
> @@ -0,0 +1,105 @@
> +/*
> + * 2007+ Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static struct dst_alg *alg_linear;
> +
> +/*
> + * This callback is invoked when node is removed from storage.
> + */
> +static void dst_linear_del_node(struct dst_node *n)
> +{
> +}
> +
> +/*
> + * This callback is invoked when node is added to storage.
> + */
> +static int dst_linear_add_node(struct dst_node *n)
> +{
> + struct dst_storage *st = n->st;
> +
> + dprintk("%s: disk_size: %llu, node_size: %llu.\n",
> + __func__, st->disk_size, n->size);
> +
> + mutex_lock(>tree_lock);
> + n->start = st->disk_size;
> + st->disk_size += n->size;
> + set_capacity(st->disk, st->disk_size);
> + mutex_unlock(>tree_lock);
> +
> + return 0;
> +}
> +
> +static int dst_linear_remap(struct dst_request *req)
> +{
> + int err;
> +
> + if (req->node->bdev) {
> + generic_make_request(req->bio);
> + return 0;
> + }
> +
> + err = kst_check_permissions(req->state, req->bio);
> + if (err)
> + return err;
> +
> + return req->state->ops->push(req);
> +}
> +
> +/*
> + * Failover callback - it is invoked each time error happens during
> + * request processing.
> + */
> +static int dst_linear_error(struct kst_state *st, int err)
> +{
> + if (err)
> + set_bit(DST_NODE_FROZEN, >node->flags);
> + else
> + clear_bit(DST_NODE_FROZEN, >node->flags);
> + return 0;
> +}
> +
> +static struct dst_alg_ops alg_linear_ops = {
> + .remap  = dst_linear_remap,
> + .add_node   = dst_linear_add_node,
> + .del_node   = dst_linear_del_node,
> + .error  = dst_linear_error,
> + .owner  = THIS_MODULE,
> +};
> +
> +static int __devinit alg_linear_init(void)
> +{
> + alg_linear = dst_alloc_alg("alg_linear", _linear_ops);
> + if (!alg_linear)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void __devexit alg_linear_exit(void)
> +{
> + dst_remove_alg(alg_linear);
> +}
> +
> +module_init(alg_linear_init);
> +module_exit(alg_linear_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Evgeniy Polyakov <[EMAIL PROTECTED]>");
> +MODULE_DESCRIPTION("Linear distributed algorithm.");
> diff --git a/drivers/block/dst/alg_mirror.c b/drivers/block/dst/alg_mirror.c
> new file mode 100644
> index 000..3c457ff
> --- /dev/null
> +++ b/drivers/block/dst/alg_mirror.c
> @@ -0,0 +1,1128 @@
> +/*
> + * 2007+ Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct dst_mirror_node_data
> +{
> + u64 age;
> +};
> +
> +struct dst_mirror_priv
> +{
> + unsigned intchunk_num;
> +
> + u64 last_start;
> +
> + spinlock_t  backlog_lock;
> + struct list_headbacklog_list;
> +
> + struct dst_mirror_node_data old_data, new_data;
> +
> + unsigned long   *chunk;
> +};
> +
> +static struct dst_alg *alg_mirror;
> +static struct bio_set *dst_mirror_bio_set;
> +
> +static int dst_mirror_resync(struct dst_node *n, int ndp);
> +
> +static void dst_mirror_mark_sync(struct 

Re: [4/4] DST: Algorithms used in distributed storage.

2007-12-12 Thread Dmitry Monakhov
On 14:47 Mon 10 Dec , Evgeniy Polyakov wrote:
 
 Algorithms used in distributed storage.
 Mirror and linear mapping code.
Hi, i've finally take a look on your DST solution.
It seems what your current implementation will not work on nonstandard
devices for example software raid0.
other comments are follows:
 
 Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]
 
 
 diff --git a/drivers/block/dst/alg_linear.c b/drivers/block/dst/alg_linear.c
 new file mode 100644
 index 000..9dc0976
 --- /dev/null
 +++ b/drivers/block/dst/alg_linear.c
 @@ -0,0 +1,105 @@
 +/*
 + * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
 + * All rights reserved.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + */
 +
 +#include linux/module.h
 +#include linux/kernel.h
 +#include linux/init.h
 +#include linux/dst.h
 +
 +static struct dst_alg *alg_linear;
 +
 +/*
 + * This callback is invoked when node is removed from storage.
 + */
 +static void dst_linear_del_node(struct dst_node *n)
 +{
 +}
 +
 +/*
 + * This callback is invoked when node is added to storage.
 + */
 +static int dst_linear_add_node(struct dst_node *n)
 +{
 + struct dst_storage *st = n-st;
 +
 + dprintk(%s: disk_size: %llu, node_size: %llu.\n,
 + __func__, st-disk_size, n-size);
 +
 + mutex_lock(st-tree_lock);
 + n-start = st-disk_size;
 + st-disk_size += n-size;
 + set_capacity(st-disk, st-disk_size);
 + mutex_unlock(st-tree_lock);
 +
 + return 0;
 +}
 +
 +static int dst_linear_remap(struct dst_request *req)
 +{
 + int err;
 +
 + if (req-node-bdev) {
 + generic_make_request(req-bio);
 + return 0;
 + }
 +
 + err = kst_check_permissions(req-state, req-bio);
 + if (err)
 + return err;
 +
 + return req-state-ops-push(req);
 +}
 +
 +/*
 + * Failover callback - it is invoked each time error happens during
 + * request processing.
 + */
 +static int dst_linear_error(struct kst_state *st, int err)
 +{
 + if (err)
 + set_bit(DST_NODE_FROZEN, st-node-flags);
 + else
 + clear_bit(DST_NODE_FROZEN, st-node-flags);
 + return 0;
 +}
 +
 +static struct dst_alg_ops alg_linear_ops = {
 + .remap  = dst_linear_remap,
 + .add_node   = dst_linear_add_node,
 + .del_node   = dst_linear_del_node,
 + .error  = dst_linear_error,
 + .owner  = THIS_MODULE,
 +};
 +
 +static int __devinit alg_linear_init(void)
 +{
 + alg_linear = dst_alloc_alg(alg_linear, alg_linear_ops);
 + if (!alg_linear)
 + return -ENOMEM;
 +
 + return 0;
 +}
 +
 +static void __devexit alg_linear_exit(void)
 +{
 + dst_remove_alg(alg_linear);
 +}
 +
 +module_init(alg_linear_init);
 +module_exit(alg_linear_exit);
 +
 +MODULE_LICENSE(GPL);
 +MODULE_AUTHOR(Evgeniy Polyakov [EMAIL PROTECTED]);
 +MODULE_DESCRIPTION(Linear distributed algorithm.);
 diff --git a/drivers/block/dst/alg_mirror.c b/drivers/block/dst/alg_mirror.c
 new file mode 100644
 index 000..3c457ff
 --- /dev/null
 +++ b/drivers/block/dst/alg_mirror.c
 @@ -0,0 +1,1128 @@
 +/*
 + * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
 + * All rights reserved.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + */
 +
 +#include linux/module.h
 +#include linux/kernel.h
 +#include linux/init.h
 +#include linux/poll.h
 +#include linux/dst.h
 +
 +struct dst_mirror_node_data
 +{
 + u64 age;
 +};
 +
 +struct dst_mirror_priv
 +{
 + unsigned intchunk_num;
 +
 + u64 last_start;
 +
 + spinlock_t  backlog_lock;
 + struct list_headbacklog_list;
 +
 + struct dst_mirror_node_data old_data, new_data;
 +
 + unsigned long   *chunk;
 +};
 +
 +static struct dst_alg *alg_mirror;
 +static struct bio_set *dst_mirror_bio_set;
 +
 +static int dst_mirror_resync(struct dst_node *n, int ndp);
 +
 +static void dst_mirror_mark_sync(struct dst_node *n)
 +{
 + if (test_bit(DST_NODE_NOTSYNC, n-flags)) {
 + 

Re: [4/4] DST: Algorithms used in distributed storage.

2007-12-12 Thread Evgeniy Polyakov
On Wed, Dec 12, 2007 at 12:12:47PM +0300, Dmitry Monakhov ([EMAIL PROTECTED]) 
wrote:
 On 14:47 Mon 10 Dec , Evgeniy Polyakov wrote:
  
  Algorithms used in distributed storage.
  Mirror and linear mapping code.
 Hi, i've finally take a look on your DST solution.
 It seems what your current implementation will not work on nonstandard
 devices for example software raid0.
 other comments are follows:

  +static int dst_mirror_process_node_data(struct dst_node *n,
  +   struct dst_mirror_node_data *ndata, int op)

  +
  +   kunmap(cmp-page);
  MINOR_BUG:
You has forgot to unmap page on error path, so IMHO it is better to move
kunmap to err_out_free_cmp label.

Yep, I will fix this.

  +   priv = kzalloc(sizeof(struct dst_mirror_priv), GFP_KERNEL);
  +   if (!priv)
  +   return -ENOMEM;
  +
  +   priv-chunk_num = st-disk_size;
  +
  +   priv-chunk = vmalloc(DIV_ROUND_UP(priv-chunk_num, BITS_PER_LONG) * 
  sizeof(long));
  Ohhh. My. I want to add my 500G hdd. Do you really wanna
say what i have to store 128Mb in memory object for this.

Right now yes. There was a code which used single bit for bigger
data units, but I dropped it because of resync troubles (i.e. when
one single sector has been updated, it requires to resync the whole
block). I can not say which case is better though.

  +   dprintk(%s: start: %llu, size: %llu/%u, bio: %p, req: %p, 
  +   node: %p.\n,
  +   __func__, req-start, req-size, nr_pages, bio,
  +   req, req-node);
  +
  +   err = n-st-queue-make_request_fn(n-st-queue, bio);
  Why direct make_request_fn instead of generic_make_request?

generic_make_request() will queue the bio in this case,
so I call request_fn directly.

  +   for (i = 0; i  DIV_ROUND_UP(priv-chunk_num, BITS_PER_LONG); ++i) {
  +   int bit, num, start;
  +   unsigned long word = priv-chunk[i];
  +
  +   if (!word)
  +   continue;
  +
  +   num = 0;
  +   start = -1;
  +   while (word  num  BITS_PER_LONG) {
  +   bit = __ffs(word);
  +   if (start == -1)
  +   start = bit;
  +   num++;
  MINOR_BUG: Seems you have misstyped here. AFAIU @num represent position
of last non zero bit (start + num == last_non_zero_bit_pos)
   if (start == -1) {
   start = bit;
   num = 1;
   } else
 num += bit;

Yes, you are right of course.
Since I shift word to more than a single bit, @num has to be update
accordingly.

  +   word = (bit+1);

Dmitry, thanks a lot for comments, I will fix issues you pointed in the
next release, although will stay bitmap case opened for a while.

-- 
Evgeniy Polyakov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/