Re: [PATCH 1/5] sched, rt: move .switched_from out of the scope of CONFIG_SMP

2013-12-28 Thread Zhi Yong Wu
On Sat, Dec 28, 2013 at 5:48 PM, Kirill Tkhai  wrote:
> On Sat, Dec 28, 2013 at 05:37:32 +0800, Zhi Yong Wu wrote:
>> On Sat, Dec 28, 2013 at 5:19 PM, Kirill Tkhai  wrote:
>> > On Fri, Dec 27, 2013 at 07:41:00 +0800, Zhi Yong Wu wrote:
>> >> From: Zhi Yong Wu 
>> >>
>> >> .switched_from shouldn't be initialized in the scope of CONFIG_SMP,
>> >> so this patch is trying to move it out.
>> >>
>> >> Signed-off-by: Zhi Yong Wu 
>> >> ---
>> >>  kernel/sched/rt.c |2 +-
>> >>  1 files changed, 1 insertions(+), 1 deletions(-)
>> >>
>> >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> >> index 1c40655..f34d41b 100644
>> >> --- a/kernel/sched/rt.c
>> >> +++ b/kernel/sched/rt.c
>> >> @@ -2002,9 +2002,9 @@ const struct sched_class rt_sched_class = {
>> >>   .pre_schedule   = pre_schedule_rt,
>> >>   .post_schedule  = post_schedule_rt,
>> >>   .task_woken = task_woken_rt,
>> >> - .switched_from  = switched_from_rt,
>> >>  #endif
>> >>
>> >> + .switched_from  = switched_from_rt,
>> >>   .set_curr_task  = set_curr_task_rt,
>> >>   .task_tick  = task_tick_rt,
>> >
>> > This will not compile in !SMP mode because the body of switched_from_rt()
>> > is still under the CONFIG_SMP #ifdef.
>> How about also moving its body out of the CONFIG_SMP block?
>
> switched_from_rt() is necessary only in SMP mode, so I think we should
> not change anything connected with it here. It's already initialized
> properly.
Please ignore this patch, thanks.
>
>> >
>> > Kirill
>>
>>
>>
>> --
>> Regards,
>>
>> Zhi Yong Wu



-- 
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
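
Kirill's objection in the thread above can be reproduced outside the kernel. The following is a minimal sketch, not kernel code: struct demo_sched_class and every demo_* name are hypothetical stand-ins for struct sched_class and its RT callbacks. It shows why an initializer that names a function has to sit under the same #ifdef as the function body, and why callers then have to tolerate a NULL hook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_class; not the kernel definition. */
struct demo_sched_class {
	void (*switched_from)(int *state);	/* optional hook */
	void (*task_tick)(int *state);		/* always present */
};

#ifdef CONFIG_SMP
/* The body exists only when CONFIG_SMP is defined ... */
static void demo_switched_from(int *state)
{
	*state += 100;
}
#endif

static void demo_task_tick(int *state)
{
	*state += 1;
}

static const struct demo_sched_class demo_class = {
#ifdef CONFIG_SMP
	/* ... so the initializer naming it must stay under the same guard. */
	.switched_from	= demo_switched_from,
#endif
	.task_tick	= demo_task_tick,
};

/* A caller has to check the optional hook, since it is NULL without SMP. */
static int demo_run(void)
{
	int state = 0;

	if (demo_class.switched_from)
		demo_class.switched_from(&state);
	demo_class.task_tick(&state);
	return state;
}
```

Built without -DCONFIG_SMP, the hook is left NULL by the designated initializer and demo_run() only calls the tick callback. Moving the .switched_from line below the #endif while leaving demo_switched_from() guarded would fail to compile in that configuration, which is the point made in the review.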


Re: [PATCH 1/5] sched, rt: move .switched_from out of the scope of CONFIG_SMP

2013-12-28 Thread Zhi Yong Wu
On Sat, Dec 28, 2013 at 5:19 PM, Kirill Tkhai  wrote:
> On Fri, Dec 27, 2013 at 07:41:00 +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> .switched_from shouldn't be initialized in the scope of CONFIG_SMP,
>> so this patch is trying to move it out.
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  kernel/sched/rt.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index 1c40655..f34d41b 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -2002,9 +2002,9 @@ const struct sched_class rt_sched_class = {
>>   .pre_schedule   = pre_schedule_rt,
>>   .post_schedule  = post_schedule_rt,
>>   .task_woken = task_woken_rt,
>> - .switched_from  = switched_from_rt,
>>  #endif
>>
>> + .switched_from  = switched_from_rt,
>>   .set_curr_task  = set_curr_task_rt,
>>   .task_tick  = task_tick_rt,
>
> This will not compile in !SMP mode because the body of switched_from_rt()
> is still under the CONFIG_SMP #ifdef.
How about also moving its body out of the CONFIG_SMP block?
>
> Kirill



-- 
Regards,

Zhi Yong Wu


[PATCH 0/5] Sched: Some trivial typo fixes

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

These were found while I was reviewing the sched-related source code.

Zhi Yong Wu (5):
  sched, rt: move .switched_from out of the scope of CONFIG_SMP
  sched, fair: fix the comment of move_tasks()
  sched, fair: fix the typo in select_idle_sibling()
  sched, fair: fix the comment of select_task_rq_fair()
  Documentation, sched-arch.txt: fix the incorrect syntax

 Documentation/scheduler/sched-arch.txt |2 +-
 kernel/sched/fair.c|6 +++---
 kernel/sched/rt.c  |2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

-- 
1.7.6.5



[PATCH 3/5] sched, fair: fix the typo in select_idle_sibling()

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 kernel/sched/fair.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a82ae0a..db23d71 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4191,7 +4191,7 @@ static int select_idle_sibling(struct task_struct *p, int target)
return i;
 
/*
-* Otherwise, iterate the domains and find an elegible idle cpu.
+* Otherwise, iterate the domains and find an eligible idle cpu.
 */
sd = rcu_dereference(per_cpu(sd_llc, target));
for_each_lower_domain(sd) {
-- 
1.7.6.5



[PATCH 4/5] sched, fair: fix the comment of select_task_rq_fair()

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 kernel/sched/fair.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index db23d71..eaa1e91 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4218,7 +4218,7 @@ done:
 }
 
 /*
- * sched_balance_self: balance the current task (running on cpu) in domains
+ * select_task_rq_fair: balance the current task (running on cpu) in domains
  * that have the 'flag' flag set. In practice, this is SD_BALANCE_FORK and
  * SD_BALANCE_EXEC.
  *
-- 
1.7.6.5



[PATCH 1/5] sched, rt: move .switched_from out of the scope of CONFIG_SMP

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

.switched_from shouldn't be initialized in the scope of CONFIG_SMP,
so this patch is trying to move it out.

Signed-off-by: Zhi Yong Wu 
---
 kernel/sched/rt.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 1c40655..f34d41b 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2002,9 +2002,9 @@ const struct sched_class rt_sched_class = {
.pre_schedule   = pre_schedule_rt,
.post_schedule  = post_schedule_rt,
.task_woken = task_woken_rt,
-   .switched_from  = switched_from_rt,
 #endif
 
+   .switched_from  = switched_from_rt,
.set_curr_task  = set_curr_task_rt,
.task_tick  = task_tick_rt,
 
-- 
1.7.6.5



[PATCH 5/5] Documentation, sched-arch.txt: fix the incorrect syntax

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 Documentation/scheduler/sched-arch.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/scheduler/sched-arch.txt b/Documentation/scheduler/sched-arch.txt
index 9290de7..0a7d252 100644
--- a/Documentation/scheduler/sched-arch.txt
+++ b/Documentation/scheduler/sched-arch.txt
@@ -21,7 +21,7 @@ CPU idle
 
 Your cpu_idle routines need to obey the following rules:
 
-1. Preempt should now disabled over idle routines. Should only
+1. Preempt should be now disabled over idle routines. Should only
be enabled to call schedule() then disabled again.
 
 2. need_resched/TIF_NEED_RESCHED is only ever set, and will never
-- 
1.7.6.5



[PATCH 2/5] sched, fair: fix the comment of move_tasks()

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 kernel/sched/fair.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c7395d9..a82ae0a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4982,7 +4982,7 @@ static const unsigned int sched_nr_migrate_break = 32;
 /*
  * move_tasks tries to move up to imbalance weighted load from busiest to
  * this_rq, as part of a balancing operation within domain "sd".
- * Returns 1 if successful and 0 otherwise.
+ * Returns the number of moved tasks if successful and 0 otherwise.
  *
  * Called with both runqueues locked.
  */
-- 
1.7.6.5



[PATCH 0/5] Sched: Some trivial typo fixes

2013-12-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

*** BLURB HERE ***

Zhi Yong Wu (5):
  sched, rt: move .switched_from out of the scope of CONFIG_SMP
  sched, fair: fix the comment of move_tasks()
  sched, fair: fix the typo in select_idle_sibling()
  sched, fair: fix the comment of select_task_rq_fair()
  Documentation, sched-arch.txt: fix the incorrect syntax

 Documentation/scheduler/sched-arch.txt |2 +-
 kernel/sched/fair.c|6 +++---
 kernel/sched/rt.c  |2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

-- 
1.7.6.5



Re: [PATCH stable 2/2] virtio-net: make all RX paths handle erors consistently

2013-12-25 Thread Zhi Yong Wu
typo in the subject

s/erors/errors/

On Wed, Dec 25, 2013 at 10:56 PM, Michael S. Tsirkin  wrote:
> receive mergeable now handles errors internally.
> Do same for big and small packet paths, otherwise
> the logic is too hard to follow.
>
> Cc: Jason Wang 
> Cc: David S. Miller 
> Signed-off-by: Michael S. Tsirkin 
>
> (cherry picked from commit f121159d72091f25afb22007c833e60a6845e912)
> ---
>  drivers/net/virtio_net.c | 56 +++-
>  1 file changed, 36 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 435076f..c0ed6d5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -297,6 +297,34 @@ static struct sk_buff *page_to_skb(struct receive_queue *rq,
> return skb;
>  }
>
> +static struct sk_buff *receive_small(void *buf, unsigned int len)
> +{
> +   struct sk_buff * skb = buf;
> +
> +   len -= sizeof(struct virtio_net_hdr);
> +   skb_trim(skb, len);
> +
> +   return skb;
> +}
> +
> +static struct sk_buff *receive_big(struct net_device *dev,
> +  struct receive_queue *rq,
> +  void *buf)
> +{
> +   struct page *page = buf;
> +   struct sk_buff *skb = page_to_skb(rq, page, 0);
> +
> +   if (unlikely(!skb))
> +   goto err;
> +
> +   return skb;
> +
> +err:
> +   dev->stats.rx_dropped++;
> +   give_pages(rq, page);
> +   return NULL;
> +}
> +
>  static struct sk_buff *receive_mergeable(struct net_device *dev,
>  struct receive_queue *rq,
>  void *buf,
> @@ -360,7 +388,6 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
> struct net_device *dev = vi->dev;
> struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
> struct sk_buff *skb;
> -   struct page *page;
> struct skb_vnet_hdr *hdr;
>
> if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
> @@ -372,26 +399,15 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
> dev_kfree_skb(buf);
> return;
> }
> +   if (vi->mergeable_rx_bufs)
> +   skb = receive_mergeable(dev, rq, buf, len);
> +   else if (vi->big_packets)
> +   skb = receive_big(dev, rq, buf);
> +   else
> +   skb = receive_small(buf, len);
>
> -   if (!vi->mergeable_rx_bufs && !vi->big_packets) {
> -   skb = buf;
> -   len -= sizeof(struct virtio_net_hdr);
> -   skb_trim(skb, len);
> -   } else {
> -   page = buf;
> -   if (vi->mergeable_rx_bufs) {
> -   skb = receive_mergeable(dev, rq, page, len);
> -   if (unlikely(!skb))
> -   return;
> -   } else {
> -   skb = page_to_skb(rq, page, len);
> -   if (unlikely(!skb)) {
> -   dev->stats.rx_dropped++;
> -   give_pages(rq, page);
> -   return;
> -   }
> -   }
> -   }
> +   if (unlikely(!skb))
> +   return;
>
> hdr = skb_vnet_hdr(skb);
>
> --
> MST
>
> ___
> Virtualization mailing list
> virtualizat...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization



-- 
Regards,

Zhi Yong Wu
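
The restructuring in the quoted patch follows a simple pattern: each receive path owns its own error accounting and returns NULL on failure, so the dispatcher needs only one NULL check. Below is a minimal user-space sketch of that pattern. All demo_* types and functions are hypothetical stand-ins, not the virtio_net ones, and the "bad packet" condition (non-positive length) is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the virtio_net types. */
enum demo_mode { DEMO_SMALL, DEMO_BIG, DEMO_MERGEABLE };

struct demo_dev {
	enum demo_mode mode;
	unsigned long rx_dropped;
	unsigned long rx_packets;
};

struct demo_pkt {
	int len;
};

/* The small path cannot fail in this sketch. */
static struct demo_pkt *demo_receive_small(struct demo_pkt *buf)
{
	return buf;
}

/* The big path counts its own drops and reports failure as NULL. */
static struct demo_pkt *demo_receive_big(struct demo_dev *dev, struct demo_pkt *buf)
{
	if (buf->len <= 0) {
		dev->rx_dropped++;
		return NULL;
	}
	return buf;
}

/* The mergeable path follows the same contract as the big path. */
static struct demo_pkt *demo_receive_mergeable(struct demo_dev *dev, struct demo_pkt *buf)
{
	if (buf->len <= 0) {
		dev->rx_dropped++;
		return NULL;
	}
	return buf;
}

/* Dispatcher: pick a path, then handle failure in exactly one place. */
static int demo_receive_buf(struct demo_dev *dev, struct demo_pkt *buf)
{
	struct demo_pkt *pkt;

	if (dev->mode == DEMO_MERGEABLE)
		pkt = demo_receive_mergeable(dev, buf);
	else if (dev->mode == DEMO_BIG)
		pkt = demo_receive_big(dev, buf);
	else
		pkt = demo_receive_small(buf);

	if (!pkt)
		return -1;	/* the path already accounted for the drop */

	dev->rx_packets++;
	return 0;
}
```

Because every helper has the same "NULL means already handled" contract, adding another receive mode only touches the dispatch chain, not the error handling after it.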


Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
On Wed, Dec 18, 2013 at 12:58 PM, Tom Herbert  wrote:
>>> Yes, in its current state it's broken. But maybe we can try to fix
>>> it instead of arbitrarily removing it. Please see my patches on
>>> plumbing RFS into tuntap which may start to make it useful.
>> Do you mean your patch [5/5] tun: Added support for RFS on tun flows?
>> Sorry, can you explain in more detail?
>
> Correct. It was RFC since I didn't have a good way to test, if you do
> please try it and see if there's any effect. We should also be able to
Interesting, I will try to dig into it. Sorry, I don't understand why you
can't test it. Does it require some special hardware support or other
facilities?
> do something similar for KVM guests, either doing the flow lookup on
> each packet from the guest, or use aRFS interface from the guest
> driver for end to end RFS (more exciting prospect). We are finding
Which two ends do you mean?
> that guest to driver accelerations like this (and tso, lro) are quite
Sorry, I am a bit confused: does "the driver" here mean virtio_net or the tuntap driver?
> important in getting virtual networking performance up.
>
>>
>>>
>>> Tom
>>>
>>>> Signed-off-by: Zhi Yong Wu 
>>>> ---
>>>>  drivers/net/tun.c |  208 +++--
>>>>  1 files changed, 10 insertions(+), 198 deletions(-)
>>>>
>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>> index 7c8343a..7c27fdc 100644
>>>> --- a/drivers/net/tun.c
>>>> +++ b/drivers/net/tun.c
>>>> @@ -32,12 +32,15 @@
>>>>   *
>>>>   *  Daniel Podlejski 
>>>>   *Modifications for 2.3.99-pre5 kernel.
>>>> + *
>>>> + *  Zhi Yong Wu 
>>>> + *Remove the flow cache.
>>>>   */
>>>>
>>>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>>
>>>>  #define DRV_NAME   "tun"
>>>> -#define DRV_VERSION"1.6"
>>>> +#define DRV_VERSION"1.7"
>>>>  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
>>>>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>>>>
>>>> @@ -146,18 +149,6 @@ struct tun_file {
>>>> struct tun_struct *detached;
>>>>  };
>>>>
>>>> -struct tun_flow_entry {
>>>> -   struct hlist_node hash_link;
>>>> -   struct rcu_head rcu;
>>>> -   struct tun_struct *tun;
>>>> -
>>>> -   u32 rxhash;
>>>> -   int queue_index;
>>>> -   unsigned long updated;
>>>> -};
>>>> -
>>>> -#define TUN_NUM_FLOW_ENTRIES 1024
>>>> -
>>>>  /* Since the socket were moved to tun_file, to preserve the behavior of persist
>>>>   * device, socket filter, sndbuf and vnet header size were restore when the
>>>>   * file were attached to a persist device.
>>>> @@ -184,163 +175,11 @@ struct tun_struct {
>>>> int debug;
>>>>  #endif
>>>> spinlock_t lock;
>>>> -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>>>> -   struct timer_list flow_gc_timer;
>>>> -   unsigned long ageing_time;
>>>> unsigned int numdisabled;
>>>> struct list_head disabled;
>>>> void *security;
>>>> -   u32 flow_count;
>>>>  };
>>>>
>>>> -static inline u32 tun_hashfn(u32 rxhash)
>>>> -{
>>>> -   return rxhash & 0x3ff;
>>>> -}
>>>> -
>>>> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 rxhash)
>>>> -{
>>>> -   struct tun_flow_entry *e;
>>>> -
>>>> -   hlist_for_each_entry_rcu(e, head, hash_link) {
>>>> -   if (e->rxhash == rxhash)
>>>> -   return e;
>>>> -   }
>>>> -   return NULL;
>>>> -}
>>>> -
>>>> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
>>>> - struct hlist_head *head,
>>>> - u32 rxhash, u16 queue_index)
>>>> -{
>>>> -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);

Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
Hi Tom,

On Wed, Dec 18, 2013 at 12:06 PM, Tom Herbert  wrote:
> On Mon, Dec 16, 2013 at 11:26 PM, Zhi Yong Wu  wrote:
>> From: Zhi Yong Wu 
>>
>> The flow cache is an extremely broken concept, and it usually brings up
>> growth issues and DoS attacks, so this patch is trying to remove it from
>> the tuntap driver, and instead use a simpler way for its flow control.
>>
> Yes, in its current state it's broken. But maybe we can try to fix
> it instead of arbitrarily removing it. Please see my patches on
> plumbing RFS into tuntap which may start to make it useful.
Do you mean your patch [5/5] tun: Added support for RFS on tun flows?
Sorry, can you explain in more detail?

>
> Tom
>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  drivers/net/tun.c |  208 +++--
>>  1 files changed, 10 insertions(+), 198 deletions(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 7c8343a..7c27fdc 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -32,12 +32,15 @@
>>   *
>>   *  Daniel Podlejski 
>>   *Modifications for 2.3.99-pre5 kernel.
>> + *
>> + *  Zhi Yong Wu 
>> + *Remove the flow cache.
>>   */
>>
>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>  #define DRV_NAME   "tun"
>> -#define DRV_VERSION"1.6"
>> +#define DRV_VERSION"1.7"
>>  #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
>>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>>
>> @@ -146,18 +149,6 @@ struct tun_file {
>> struct tun_struct *detached;
>>  };
>>
>> -struct tun_flow_entry {
>> -   struct hlist_node hash_link;
>> -   struct rcu_head rcu;
>> -   struct tun_struct *tun;
>> -
>> -   u32 rxhash;
>> -   int queue_index;
>> -   unsigned long updated;
>> -};
>> -
>> -#define TUN_NUM_FLOW_ENTRIES 1024
>> -
>>  /* Since the socket were moved to tun_file, to preserve the behavior of persist
>>   * device, socket filter, sndbuf and vnet header size were restore when the
>>   * file were attached to a persist device.
>> @@ -184,163 +175,11 @@ struct tun_struct {
>> int debug;
>>  #endif
>> spinlock_t lock;
>> -   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>> -   struct timer_list flow_gc_timer;
>> -   unsigned long ageing_time;
>> unsigned int numdisabled;
>> struct list_head disabled;
>> void *security;
>> -   u32 flow_count;
>>  };
>>
>> -static inline u32 tun_hashfn(u32 rxhash)
>> -{
>> -   return rxhash & 0x3ff;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 rxhash)
>> -{
>> -   struct tun_flow_entry *e;
>> -
>> -   hlist_for_each_entry_rcu(e, head, hash_link) {
>> -   if (e->rxhash == rxhash)
>> -   return e;
>> -   }
>> -   return NULL;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
>> - struct hlist_head *head,
>> - u32 rxhash, u16 queue_index)
>> -{
>> -   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
>> -
>> -   if (e) {
>> -   tun_debug(KERN_INFO, tun, "create flow: hash %u index %u\n",
>> - rxhash, queue_index);
>> -   e->updated = jiffies;
>> -   e->rxhash = rxhash;
>> -   e->queue_index = queue_index;
>> -   e->tun = tun;
>> -   hlist_add_head_rcu(&e->hash_link, head);
>> -   ++tun->flow_count;
>> -   }
>> -   return e;
>> -}
>> -
>> -static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e)
>> -{
>> -   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
>> - e->rxhash, e->queue_index);
>> -   hlist_del_rcu(&e->hash_link);
>> -   kfree_rcu(e, rcu);
>> -   --tun->flow_count;
>> -}
>> -
>> -static void tun_flow_flush(struct tun_struct *tun)
>> -{
>> -   int i;
>> -
>> -   spin_lock_bh(&tun->lock);
>> -   for (i = 0; i < TUN_NUM_FL

Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
On Tue, Dec 17, 2013 at 6:05 PM, Jason Wang  wrote:
> On 12/17/2013 05:13 PM, Zhi Yong Wu wrote:
>> On Tue, Dec 17, 2013 at 4:49 PM, Jason Wang  wrote:
>>> > On 12/17/2013 03:26 PM, Zhi Yong Wu wrote:
>>>> >> From: Zhi Yong Wu 
>>>> >>
>>>> >> The flow cache is an extremely broken concept, and it usually brings up
>>>> >> growth issues and DoS attacks, so this patch is trying to remove it from
>>>> >> the tuntap driver, and instead use a simpler way for its flow control.
>>> >
>>> > NACK.
>>> >
>>> > This single function revert does not make sense to me. Since:
>> IIRC, the tuntap flow cache is only used to save the mapping of skb
>> packet <-> queue index. My idea is to save the queue index in the sk_buff
>> early, when the skb buffer is filled, rather than in a flow cache as the
>> current code does. This method is actually simpler and doesn't need any
>> flow cache at all.
>
> Nope. Flow caches record the flow to queues mapping like what most
> multiqueue nic does. The only difference is tun record it silently while
> most nic needs driver to tell the mapping.
I just checked the virtio spec; I seem to have missed the fact that the flow
cache enables packet steering in mq mode. Thanks for your comments. But I
have some concerns about some of them.
>
> What your patch does is:
> - set the queue mapping of skb during tun_get_user(). But most drivers
> using XPS or processor id to select the real tx queue. So the real txq
> depends on the cpu that vhost or qemu is running. This setting does not
Don't those drivers invoke netdev_pick_tx() or its counterpart, e.g.
tun_select_queue(), to select the real tx queue? Or can you give an
example?
Moreover, how do those drivers know which CPU vhost or qemu is running on?
> have any effect in fact.
> - the queue mapping of skb were fetched during tun_select_queue(). This
> value is usually set by a multiqueue nic to record which hardware rxq
> was this packet came.
Ah? Can you let me know where an mq NIC controller sets it?
>
> Can you explain how your patch works exactly?
You have understood it.
>>> >
>>> > - You in fact removes the flow steering function in tun. We definitely
>>> > need something like this to unbreak the TCP performance in a multiqueue
>> I don't think it will degrade TCP perf even in an mq guest; my idea may
>> even have better TCP perf, because it doesn't need any cache table
>> lookup, etc.
>
> Did you test and compare the performance numbers? Did you run profiler
> to see how much does the lookup cost?
No. As I just said above, I missed that the flow cache can enable packet
steering. But did you do related perf testing? To be honest, I am
wondering how much packet steering can improve perf. It actually
also adds a lot of cache lookup cost.
>>> > guest. Please have a look at the virtio-net driver / virtio spec for
>>> > more information.
>>> > - The total number of flow caches were limited to 4096, so there's no
>>> > DoS or growth issue.
>> Can you check why the ipv4 routing cache was removed? Maybe I am missing
>> something; if so, please correct me. :)
>
> The main differences is that the flow caches were best effort. Tun can
> not store all flows to queue mapping, and even a hardware nic can not do
> this. If a packet misses the flow cache, it's safe to distribute it
> randomly or through another method. So the limitation just work.
Exactly, we can know this from tun_select_queue().
>
> Could you please explain the DoS or growth issue you meet here?
>>> > - The only issue is scalability, but fixing this is not easy. We can
>>> > just use arrays/indirection table like RSS instead of hash buckets, it
>>> > saves some time in linear search but has other issues like collision
>>> > - I've also had a RFC of using aRFS in the past, it also has several
>>> > drawbacks such as busy looping in the networking hotspot.
>>> >
>>> > So in conclusion, we need flow steering in tun, just removing current
>>> > method does not help. The proper way is to expose several different
>>> > methods to user and let user to choose the preferable mechanism like
>>> > packet.
>> By the way, let us look at what other networking guys think of this,
>> such as MST, dave, etc. :)
>>
>
> Of course.



-- 
Regards,

Zhi Yong Wu
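
Jason's description of the flow cache (a bounded, best-effort map from flow hash to queue, with a safe fallback on a miss) can be sketched in user space as below. This is an illustration under stated assumptions, not the tun.c implementation: all demo_* names are hypothetical, a fixed pool replaces kmalloc(), there is no RCU, locking, or ageing timer, and the hash-scaling fallback on a miss is an assumption chosen for the example. The 1024 buckets and the 4096-entry cap are the figures mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_BUCKETS 1024u	/* same bucket count as TUN_NUM_FLOW_ENTRIES */
#define DEMO_MAX_FLOWS   4096u	/* the cap mentioned in the thread */

struct demo_flow {
	struct demo_flow *next;
	uint32_t rxhash;
	uint16_t queue;
};

struct demo_flow_cache {
	struct demo_flow *buckets[DEMO_NUM_BUCKETS];
	struct demo_flow pool[DEMO_MAX_FLOWS];	/* fixed pool instead of kmalloc() */
	uint32_t count;
};

/* Walk one hash chain looking for an exact rxhash match. */
static struct demo_flow *demo_flow_find(struct demo_flow_cache *c, uint32_t rxhash)
{
	struct demo_flow *e = c->buckets[rxhash & (DEMO_NUM_BUCKETS - 1)];

	while (e && e->rxhash != rxhash)
		e = e->next;
	return e;
}

/* Record the queue a flow was last seen on; silently give up when full. */
static void demo_flow_update(struct demo_flow_cache *c, uint32_t rxhash, uint16_t queue)
{
	uint32_t idx = rxhash & (DEMO_NUM_BUCKETS - 1);
	struct demo_flow *e = demo_flow_find(c, rxhash);

	if (e) {
		e->queue = queue;
		return;
	}
	if (c->count >= DEMO_MAX_FLOWS)
		return;		/* best effort: the cap bounds memory growth */

	e = &c->pool[c->count++];
	e->rxhash = rxhash;
	e->queue = queue;
	e->next = c->buckets[idx];
	c->buckets[idx] = e;
}

/* Use the cached queue on a hit; otherwise scale the hash onto the queues. */
static uint16_t demo_select_queue(struct demo_flow_cache *c, uint32_t rxhash,
				  uint16_t numqueues)
{
	struct demo_flow *e = demo_flow_find(c, rxhash);

	if (e && e->queue < numqueues)
		return e->queue;
	return (uint16_t)(((uint64_t)rxhash * numqueues) >> 32);
}
```

The cap is what makes the "no growth issue" argument work: once DEMO_MAX_FLOWS entries exist, further flows are simply not recorded and fall through to the hash-based choice, which is still a valid queue.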


Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
On Tue, Dec 17, 2013 at 4:49 PM, Jason Wang  wrote:
> On 12/17/2013 03:26 PM, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> The flow cache is an extremely broken concept, and it usually brings up
>> growth issues and DoS attacks, so this patch is trying to remove it from
>> the tuntap driver, and instead use a simpler way for its flow control.
>
> NACK.
>
> This single function revert does not make sense to me. Since:
IIRC, the tuntap flow cache is only used to save the mapping of skb
packet <-> queue index. My idea is to save the queue index in the sk_buff
early, when the skb buffer is filled, rather than in a flow cache as the
current code does. This method is actually simpler and doesn't need any
flow cache at all.

>
> - You in fact removes the flow steering function in tun. We definitely
> need something like this to unbreak the TCP performance in a multiqueue
I don't think it will degrade TCP perf even in an mq guest; my idea may
even have better TCP perf, because it doesn't need any cache table
lookup, etc.
> guest. Please have a look at the virtio-net driver / virtio spec for
> more information.
> - The total number of flow caches were limited to 4096, so there's no
> DoS or growth issue.
Can you check why the ipv4 routing cache was removed? Maybe I am missing
something; if so, please correct me. :)
> - The only issue is scalability, but fixing this is not easy. We can
> just use arrays/indirection table like RSS instead of hash buckets, it
> saves some time in linear search but has other issues like collision
> - I've also had a RFC of using aRFS in the past, it also has several
> drawbacks such as busy looping in the networking hotspot.
>
> So in conclusion, we need flow steering in tun, just removing current
> method does not help. The proper way is to expose several different
> methods to user and let user to choose the preferable mechanism like
> packet.
By the way, let us look at what other networking guys think of this,
such as MST, dave, etc. :)
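
The alternative Jason mentions, an RSS-style indirection table instead of hash buckets, trades the chain walk for a single array index. A minimal sketch follows; the demo_* names and the table size of 128 are hypothetical choices for illustration and are not taken from any driver. It also makes the collision drawback he points out visible: two flows whose hashes share the low bits are forced onto the same queue.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INDIR_SIZE 128u	/* hypothetical size; must be a power of two */

struct demo_indir {
	uint16_t table[DEMO_INDIR_SIZE];
};

/* Spread the slots round-robin over the queues; numqueues must be non-zero. */
static void demo_indir_init(struct demo_indir *t, uint16_t numqueues)
{
	uint32_t i;

	for (i = 0; i < DEMO_INDIR_SIZE; i++)
		t->table[i] = (uint16_t)(i % numqueues);
}

/* Point the slot for this hash at a queue; every flow sharing the slot moves. */
static void demo_indir_steer(struct demo_indir *t, uint32_t rxhash, uint16_t queue)
{
	t->table[rxhash & (DEMO_INDIR_SIZE - 1)] = queue;
}

/* O(1) lookup: no chain walk, no allocation, no per-flow state. */
static uint16_t demo_indir_lookup(const struct demo_indir *t, uint32_t rxhash)
{
	return t->table[rxhash & (DEMO_INDIR_SIZE - 1)];
}
```

Memory use is fixed at DEMO_INDIR_SIZE entries regardless of how many flows exist, which removes the growth concern entirely, at the cost of steering precision.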

>
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  drivers/net/tun.c |  208 +++--
>>  1 files changed, 10 insertions(+), 198 deletions(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 7c8343a..7c27fdc 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -32,12 +32,15 @@
>>   *
>>   *  Daniel Podlejski 
>>   *Modifications for 2.3.99-pre5 kernel.
>> + *
>> + *  Zhi Yong Wu 
>> + *Remove the flow cache.
>>   */
>>
>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>  #define DRV_NAME "tun"
>> -#define DRV_VERSION  "1.6"
>> +#define DRV_VERSION  "1.7"
>>  #define DRV_DESCRIPTION  "Universal TUN/TAP device driver"
>>  #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
>>
>> @@ -146,18 +149,6 @@ struct tun_file {
>>   struct tun_struct *detached;
>>  };
>>
>> -struct tun_flow_entry {
>> - struct hlist_node hash_link;
>> - struct rcu_head rcu;
>> - struct tun_struct *tun;
>> -
>> - u32 rxhash;
>> - int queue_index;
>> - unsigned long updated;
>> -};
>> -
>> -#define TUN_NUM_FLOW_ENTRIES 1024
>> -
>>  /* Since the socket were moved to tun_file, to preserve the behavior of persist
>>   * device, socket filter, sndbuf and vnet header size were restore when the
>>   * file were attached to a persist device.
>> @@ -184,163 +175,11 @@ struct tun_struct {
>>   int debug;
>>  #endif
>>   spinlock_t lock;
>> - struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>> - struct timer_list flow_gc_timer;
>> - unsigned long ageing_time;
>>   unsigned int numdisabled;
>>   struct list_head disabled;
>>   void *security;
>> - u32 flow_count;
>>  };
>>
>> -static inline u32 tun_hashfn(u32 rxhash)
>> -{
>> - return rxhash & 0x3ff;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 rxhash)
>> -{
>> - struct tun_flow_entry *e;
>> -
>> - hlist_for_each_entry_rcu(e, head, hash_link) {
>> - if (e->rxhash == rxhash)
>> - return e;
>> - }
>> - return NULL;
>> -}
>> -
>> -static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
>> -   struct hlist_head *head,
>> -  

Re: [RFC PATCH] net, tun: remove the flow cache

2013-12-17 Thread Zhi Yong Wu
On Mon, 2013-12-16 at 23:47 -0800, Stephen Hemminger wrote:
> On Tue, 17 Dec 2013 15:26:22 +0800
> Zhi Yong Wu  wrote:
> 
> > From: Zhi Yong Wu 
> > 
> > The flow cache is an extremely broken concept, and it usually brings up
> > growth issues and DoS attacks, so this patch is trying to remove it from
> > the tuntap driver, and instead use a simpler way for its flow control.
> > 
> > Signed-off-by: Zhi Yong Wu 
> > ---
> >  drivers/net/tun.c |  208 +++--
> >  1 files changed, 10 insertions(+), 198 deletions(-)
> > 
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index 7c8343a..7c27fdc 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -32,12 +32,15 @@
> >   *
> >   *  Daniel Podlejski 
> >   *Modifications for 2.3.99-pre5 kernel.
> > + *
> > + *  Zhi Yong Wu 
> > + *Remove the flow cache.
> >   */
> 
> I agree with your patch, but please don't add to the comment changelog.
> These are all historical. The kernel development process has not used
> them for 5+ years.
> 
> Can we get kernel janitors to just remove them, or would that step
> on too many early developers toes by removing credit?
I thought that it was a big code change and needed a changelog entry,
but you seem to have a strong argument. :) I don't object to removing my
comment from the changelog if other people also agree with you.


> 

-- 
Regards,

Zhi Yong Wu



[RFC PATCH] net, tun: remove the flow cache

2013-12-16 Thread Zhi Yong Wu
From: Zhi Yong Wu 

The flow cache is an extremely broken concept, and it usually brings up
growth issues and DoS attacks, so this patch is trying to remove it from
the tuntap driver, and instead use a simpler way for its flow control.

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |  208 +++--
 1 files changed, 10 insertions(+), 198 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7c8343a..7c27fdc 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -32,12 +32,15 @@
  *
  *  Daniel Podlejski 
  *Modifications for 2.3.99-pre5 kernel.
+ *
+ *  Zhi Yong Wu 
+ *Remove the flow cache.
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #define DRV_NAME   "tun"
-#define DRV_VERSION"1.6"
+#define DRV_VERSION"1.7"
 #define DRV_DESCRIPTION"Universal TUN/TAP device driver"
 #define DRV_COPYRIGHT  "(C) 1999-2004 Max Krasnyansky "
 
@@ -146,18 +149,6 @@ struct tun_file {
struct tun_struct *detached;
 };
 
-struct tun_flow_entry {
-   struct hlist_node hash_link;
-   struct rcu_head rcu;
-   struct tun_struct *tun;
-
-   u32 rxhash;
-   int queue_index;
-   unsigned long updated;
-};
-
-#define TUN_NUM_FLOW_ENTRIES 1024
-
 /* Since the socket were moved to tun_file, to preserve the behavior of persist
  * device, socket filter, sndbuf and vnet header size were restore when the
  * file were attached to a persist device.
@@ -184,163 +175,11 @@ struct tun_struct {
int debug;
 #endif
spinlock_t lock;
-   struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
-   struct timer_list flow_gc_timer;
-   unsigned long ageing_time;
unsigned int numdisabled;
struct list_head disabled;
void *security;
-   u32 flow_count;
 };
 
-static inline u32 tun_hashfn(u32 rxhash)
-{
-   return rxhash & 0x3ff;
-}
-
-static struct tun_flow_entry *tun_flow_find(struct hlist_head *head, u32 rxhash)
-{
-   struct tun_flow_entry *e;
-
-   hlist_for_each_entry_rcu(e, head, hash_link) {
-   if (e->rxhash == rxhash)
-   return e;
-   }
-   return NULL;
-}
-
-static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun,
- struct hlist_head *head,
- u32 rxhash, u16 queue_index)
-{
-   struct tun_flow_entry *e = kmalloc(sizeof(*e), GFP_ATOMIC);
-
-   if (e) {
-   tun_debug(KERN_INFO, tun, "create flow: hash %u index %u\n",
- rxhash, queue_index);
-   e->updated = jiffies;
-   e->rxhash = rxhash;
-   e->queue_index = queue_index;
-   e->tun = tun;
-   hlist_add_head_rcu(&e->hash_link, head);
-   ++tun->flow_count;
-   }
-   return e;
-}
-
-static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e)
-{
-   tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n",
- e->rxhash, e->queue_index);
-   hlist_del_rcu(&e->hash_link);
-   kfree_rcu(e, rcu);
-   --tun->flow_count;
-}
-
-static void tun_flow_flush(struct tun_struct *tun)
-{
-   int i;
-
-   spin_lock_bh(&tun->lock);
-   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
-   struct tun_flow_entry *e;
-   struct hlist_node *n;
-
-   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link)
-   tun_flow_delete(tun, e);
-   }
-   spin_unlock_bh(&tun->lock);
-}
-
-static void tun_flow_delete_by_queue(struct tun_struct *tun, u16 queue_index)
-{
-   int i;
-
-   spin_lock_bh(&tun->lock);
-   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
-   struct tun_flow_entry *e;
-   struct hlist_node *n;
-
-   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link) {
-   if (e->queue_index == queue_index)
-   tun_flow_delete(tun, e);
-   }
-   }
-   spin_unlock_bh(&tun->lock);
-}
-
-static void tun_flow_cleanup(unsigned long data)
-{
-   struct tun_struct *tun = (struct tun_struct *)data;
-   unsigned long delay = tun->ageing_time;
-   unsigned long next_timer = jiffies + delay;
-   unsigned long count = 0;
-   int i;
-
-   tun_debug(KERN_INFO, tun, "tun_flow_cleanup\n");
-
-   spin_lock_bh(&tun->lock);
-   for (i = 0; i < TUN_NUM_FLOW_ENTRIES; i++) {
-   struct tun_flow_entry *e;
-   struct hlist_node *n;
-
-   hlist_for_each_entry_safe(e, n, &tun->flows[i], hash_link) {
-   unsigned long this_timer;
-   count++;
-

Re: [PATCH 1/5] xfs: factor prid related codes into xfs_get_initial_prid()

2013-12-14 Thread Zhi Yong Wu
On Sat, Dec 14, 2013 at 7:20 PM, Jeff Liu  wrote:
> On 12/14 2013 00:32 AM, Christoph Hellwig wrote:
>>> +static inline prid_t xfs_get_initial_prid(struct xfs_inode *dp)
>>> +{
>>> +if (dp->i_d.di_flags & XFS_DIFLAG_PROJINHERIT)
>>> +return xfs_get_projid(dp);
>>> +else
>>> +return XFS_PROJID_DEFAULT;
>>> +}
>>
>> You could skip the else here.
> Except that, I'd suggest we move this helper to proper header file with
> further refactoring in xfs_symlink(), and it could be a separate patch.
Good point, will apply it, thanks.
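
For reference, the else-less form Christoph suggests could look like the
sketch below. It is a standalone illustration: struct mock_inode, the flag
value and the default ID are hypothetical stand-ins for the XFS definitions,
not the real ones.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the XFS definitions; values are illustrative. */
#define MOCK_DIFLAG_PROJINHERIT 0x0200u
#define MOCK_PROJID_DEFAULT     0u

struct mock_inode {
	uint16_t di_flags;   /* inode flags */
	uint32_t di_projid;  /* project ID of this inode */
};

/* Return the project ID a new child of dp should start with. */
static inline uint32_t mock_get_initial_prid(const struct mock_inode *dp)
{
	if (dp->di_flags & MOCK_DIFLAG_PROJINHERIT)
		return dp->di_projid;

	/* No else needed: the branch above has already returned. */
	return MOCK_PROJID_DEFAULT;
}
```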

>
> Thanks,
> -Jeff



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 5/5] xfs: allow linkat() on O_TMPFILE files

2013-12-14 Thread Zhi Yong Wu
On Sat, Dec 14, 2013 at 4:19 PM, Dave Chinner  wrote:
> On Sat, Dec 14, 2013 at 01:36:47AM +0800, Zhi Yong Wu wrote:
>> On Sat, Dec 14, 2013 at 12:41 AM, Christoph Hellwig  wrote:
>> > On Fri, Dec 13, 2013 at 10:27:53PM +0800, Zhi Yong Wu wrote:
>> >> From: Zhi Yong Wu 
>> >>
>> >> Enable O_TMPFILE support in linkat().
>> >> For more info, please refer to:
>> >>   http://oss.sgi.com/archives/xfs/2013-08/msg00341.html
>> >
>> > Generally you should provide all reasonable information in the changelog
>> > instead of linking to it.
>> will apply this, thanks.
>> >
>> >> + if (sip->i_d.di_nlink == 0)
>> >> + tres = &M_RES(mp)->tr_link_tmpfile;
>> >> + else
>> >> + tres = &M_RES(mp)->tr_link;
>> >
>> > As mentioned before I think Dave wanted you to always use the same
>> > reservation, but I'll leave that discussion to him.
>> If we do as you say, won't that waste disk space when tons of regular
>> files are created? E.g. some files want to reserve space but get
>> ENOSPC because other files have reserved additional space?
>
> This is a log space reservation, not a disk space reservation. And
> either way, what is unused by the transaction is returned to the
> free space pool at the end of the transaction. So for simplicity,
> we should just use the one reservation for the link transaction -
> take whichever is larger at calculation time.
Good explanation, thanks Dave and Christoph. By the way, can you help
check whether the log reservation for adding/removing one inode to/from
the unlinked list is correct? Or will you check after I post the next
version?
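
Dave's suggestion (one reservation for the link transaction, sized as the
larger of the two cases at calculation time) can be sketched as below. The
struct and function names are hypothetical and the byte counts would come
from the real reservation calculations; this only shows the selection step.

```c
#include <stddef.h>

/* Hypothetical per-case log space needs, in bytes. */
struct mock_log_needs {
	size_t link_regular;  /* linking an inode with nlink > 0 */
	size_t link_tmpfile;  /* linking an O_TMPFILE inode (also leaves the unlinked list) */
};

/*
 * Compute a single reservation that covers both cases, so the caller
 * never has to choose between two reservations at run time.
 */
static size_t mock_calc_link_reservation(const struct mock_log_needs *n)
{
	return n->link_regular > n->link_tmpfile ?
	       n->link_regular : n->link_tmpfile;
}
```

With a single value there is no branch on di_nlink in the link path, and
whatever the transaction does not use goes back to the free pool when it ends.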

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 5/5] xfs: allow linkat() on O_TMPFILE files

2013-12-13 Thread Zhi Yong Wu
On Sat, Dec 14, 2013 at 12:41 AM, Christoph Hellwig  wrote:
> On Fri, Dec 13, 2013 at 10:27:53PM +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> Enable O_TMPFILE support in linkat().
>> For more info, please refer to:
>>   http://oss.sgi.com/archives/xfs/2013-08/msg00341.html
>
> Generally you should provide all reasonable information in the changelog
> instead of linking to it.
will apply this, thanks.
>
>> + if (sip->i_d.di_nlink == 0)
>> + tres = &M_RES(mp)->tr_link_tmpfile;
>> + else
>> + tres = &M_RES(mp)->tr_link;
>
> As mentioned before I think Dave wanted you to always use the same
> reservation, but I'll leave that discussion to him.
If we do as you say, won't that waste disk space when tons of regular
files are created? E.g. some files want to reserve space but get
ENOSPC because other files have reserved additional space?

>
>> +/* For creating a link to an O_TMPFILE inode, except modifying
>> + * those metadata for regular inode, we still need to remove an inode
>> + * from unlinked list at first. That is,  we can modify:
>> + *the agi hash list and counters: sector size
>> + *the on disk inode before ours in the agi hash list: inode cluster size
>> + */
>
> We always have an empty content
Done, thanks.
>
> /*
>
> line at the beginning of comments in XFS and the Linux kernel in
> general.
>
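
For reference, the comment layout Christoph describes (an empty "/*" line
first, then the text) would look roughly like this. The function under the
comment is a hypothetical, simplified stand-in so that the snippet is
self-contained; it is not the reservation calculation from the patch.

```c
#include <stddef.h>

/*
 * For creating a link to an O_TMPFILE inode we must first remove the
 * inode from the unlinked list. That can modify:
 *    the agi hash list and counters: sector size
 *    the on disk inode before ours in the agi hash list: inode cluster size
 *
 * Simplified stand-in: the name and the plain sum are illustrative only.
 */
static size_t mock_link_tmpfile_extra(size_t sector_size, size_t cluster_size)
{
	return sector_size + cluster_size;
}
```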



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 4/5] xfs: add a new method xfs_vn_tmpfile()

2013-12-13 Thread Zhi Yong Wu
On Sat, Dec 14, 2013 at 12:39 AM, Christoph Hellwig  wrote:
> On Fri, Dec 13, 2013 at 10:27:52PM +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> Add a new O_TMPFILE method to VFS interface.
>> For more info, please refer to:
>>   http://oss.sgi.com/archives/xfs/2013-08/msg00336.html
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  fs/xfs/xfs_iops.c |   22 ++
>>  1 files changed, 22 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index eb55be5..b57cd89 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -39,6 +39,7 @@
>>  #include "xfs_da_btree.h"
>>  #include "xfs_dir2_priv.h"
>>  #include "xfs_dinode.h"
>> +#include "xfs_trans_space.h"
>>
>>  #include 
>>  #include 
>> @@ -1051,6 +1052,25 @@ xfs_vn_fiemap(
>>   return 0;
>>  }
>>
>> +STATIC int
>> +xfs_vn_tmpfile(
>> + struct inode    *dir,
>> + struct dentry   *dentry,
>> + umode_t mode)
>> +{
>> + struct xfs_inode *ip = NULL;
>> + int error;
>> +
>> + error = xfs_create_tmpfile(XFS_I(dir), XFS_I(dir)->i_mount,
>
> No need to pass in the mount point here, the client can get it easily.
>
>> + mode, 0, &ip);
>
> Also no need for an always-zero argument.
Fixed, thanks.
>
>> + if (error)
>> + return -error;
>> +
>> + d_instantiate(dentry, VFS_I(ip));
>
> Shouldn't this be a call to d_tmpfile() instead?
Yes, then it needs to be called in xfs_create_tmpfile() just before
xfs_iunlink() is called.
>
> Also I'd suggest merging this into the previous patch, so that we have
> one that actually adds O_TMPFILE support, and one place to write a nice
> good changelog.
Merged them, thanks.



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 3/5] xfs: add xfs_create_tmpfile() for O_TMPFILE support

2013-12-13 Thread Zhi Yong Wu
Fixed them, thanks.

On Sat, Dec 14, 2013 at 12:37 AM, Christoph Hellwig  wrote:
>> + error = xfs_dir_ialloc(&tp, NULL, mode, 0, rdev,
>
> please pass the parent inode pointer here.
>
>> + XFS_PROJID_DEFAULT, resblks > 0,
>
> and pass the project id that you inherited from the parent here.
>



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 2/5] xfs: adjust the interface of xfs_qm_vop_dqalloc()

2013-12-13 Thread Zhi Yong Wu
On Sat, Dec 14, 2013 at 12:32 AM, Christoph Hellwig  wrote:
> On Fri, Dec 13, 2013 at 10:27:50PM +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> O_TMPFILE creation may have neither a parent inode nor a name, but a
>> struct xfs_mount can always be passed to xfs_qm_vop_dqalloc(). Its
>> interface therefore needs to be adjusted so that the O_TMPFILE creation
>> function can also use it.
>>
>> Signed-off-by: Zhi Yong Wu 
>
> This patch is not actually needed, as we do get passed a parent.
Discarded, thanks.

>



-- 
Regards,

Zhi Yong Wu


[PATCH 2/5] xfs: adjust the interface of xfs_qm_vop_dqalloc()

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

O_TMPFILE creation may have neither a parent inode nor a name, but a
struct xfs_mount can always be passed to xfs_qm_vop_dqalloc(). Its
interface therefore needs to be adjusted so that the O_TMPFILE creation
function can also use it.

Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_inode.c   |2 +-
 fs/xfs/xfs_ioctl.c   |2 +-
 fs/xfs/xfs_iops.c|3 ++-
 fs/xfs/xfs_qm.c  |   50 +++---
 fs/xfs/xfs_quota.h   |6 --
 fs/xfs/xfs_symlink.c |2 +-
 6 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index e8b9a68..71a8186 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1182,7 +1182,7 @@ xfs_create(
/*
 * Make sure that we have allocated dquot(s) on disk.
 */
-   error = xfs_qm_vop_dqalloc(dp, xfs_kuid_to_uid(current_fsuid()),
+   error = xfs_qm_vop_dqalloc(dp, mp, xfs_kuid_to_uid(current_fsuid()),
xfs_kgid_to_gid(current_fsgid()), prid,
XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT,
&udqp, &gdqp, &pdqp);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 33ad9a7..eac84bd 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1090,7 +1090,7 @@ xfs_ioctl_setattr(
 * because the i_*dquot fields will get updated anyway.
 */
if (XFS_IS_QUOTA_ON(mp) && (mask & FSX_PROJID)) {
-   code = xfs_qm_vop_dqalloc(ip, ip->i_d.di_uid,
+   code = xfs_qm_vop_dqalloc(ip, ip->i_mount, ip->i_d.di_uid,
 ip->i_d.di_gid, fa->fsx_projid,
 XFS_QMOPT_PQUOTA, &udqp, NULL, &pdqp);
if (code)
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 27e0e54..eb55be5 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -540,7 +540,8 @@ xfs_setattr_nonsize(
 */
ASSERT(udqp == NULL);
ASSERT(gdqp == NULL);
-   error = xfs_qm_vop_dqalloc(ip, xfs_kuid_to_uid(uid),
+   error = xfs_qm_vop_dqalloc(ip, ip->i_mount,
+  xfs_kuid_to_uid(uid),
   xfs_kgid_to_gid(gid),
   xfs_get_projid(ip),
   qflags, &udqp, &gdqp, NULL);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 14a4996..1f13e82 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1765,6 +1765,7 @@ xfs_qm_write_sb_changes(
 int
 xfs_qm_vop_dqalloc(
    struct xfs_inode    *ip,
+   struct xfs_mount    *mp,
xfs_dqid_t  uid,
xfs_dqid_t  gid,
prid_t  prid,
@@ -1773,7 +1774,6 @@ xfs_qm_vop_dqalloc(
    struct xfs_dquot    **O_gdqpp,
    struct xfs_dquot    **O_pdqpp)
 {
-   struct xfs_mount    *mp = ip->i_mount;
    struct xfs_dquot    *uq = NULL;
    struct xfs_dquot    *gq = NULL;
    struct xfs_dquot    *pq = NULL;
@@ -1783,17 +1783,19 @@ xfs_qm_vop_dqalloc(
if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
return 0;
 
-   lockflags = XFS_ILOCK_EXCL;
-   xfs_ilock(ip, lockflags);
+   if (ip) {
+   lockflags = XFS_ILOCK_EXCL;
+   xfs_ilock(ip, lockflags);
 
-   if ((flags & XFS_QMOPT_INHERIT) && XFS_INHERIT_GID(ip))
-   gid = ip->i_d.di_gid;
+   if ((flags & XFS_QMOPT_INHERIT) && XFS_INHERIT_GID(ip))
+   gid = ip->i_d.di_gid;
+   }
 
/*
 * Attach the dquot(s) to this inode, doing a dquot allocation
 * if necessary. The dquot(s) will not be locked.
 */
-   if (XFS_NOT_DQATTACHED(mp, ip)) {
+   if (ip && XFS_NOT_DQATTACHED(mp, ip)) {
error = xfs_qm_dqattach_locked(ip, XFS_QMOPT_DQALLOC);
if (error) {
xfs_iunlock(ip, lockflags);
@@ -1802,7 +1804,7 @@ xfs_qm_vop_dqalloc(
}
 
if ((flags & XFS_QMOPT_UQUOTA) && XFS_IS_UQUOTA_ON(mp)) {
-   if (ip->i_d.di_uid != uid) {
+   if (!ip || (ip->i_d.di_uid != uid)) {
/*
 * What we need is the dquot that has this uid, and
 * if we send the inode to dqget, the uid of the inode
@@ -1812,7 +1814,8 @@ xfs_qm_vop_dqalloc(
 * we'll deadlock by doing trans_reserve while
 * holding ilock.
 */
-   xfs_iunlock(ip, lockflags);
+   if (ip)
+   xfs_iunlock(ip, lockflags)

[PATCH 0/5] xfs: add O_TMPFILE support

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Hi, folks

  It's time to post the first formal version; any constructive comments are
welcome, thanks.

  If anyone is interested in playing with it, you can get this patchset from
my dev git on github:
  git://github.com/wuzhy/kernel.git xfs_tmpfile

  The patchset was tested against the code snippet from Andy Lutomirski and
other test cases:
  http://lwn.net/Articles/562296/
  If you have any better test cases, please let me know, thanks.

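#include <stdio.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

/* Octal flag values as on asm-generic architectures; others may differ. */
#define __O_TMPFILE 020000000
#define O_DIRECTORY 0200000
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#define AT_EMPTY_PATH 0x1000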

int main(int argc, char **argv)
{
   char buf[128];

   if (argc != 3)
 errx(1, "Usage: flinktest PATH linkat|proc");

   int fd = open(".", O_TMPFILE | O_RDWR, 0600);
   if (fd == -1)
 err(1, "O_TMPFILE");
   else
 printf("fd #: %d\n", fd);

   write(fd, "test", 4);

   if (!strcmp(argv[2], "linkat")) {
 if (linkat(fd, "", AT_FDCWD, argv[1], AT_EMPTY_PATH) != 0)
   err(1, "linkat");
   } else if (!strcmp(argv[2], "proc")) {
 sprintf(buf, "/proc/self/fd/%d", fd);
 if (linkat(AT_FDCWD, buf, AT_FDCWD, argv[1], AT_SYMLINK_FOLLOW) != 0)
   err(1, "linkat");
   } else {
 errx(1, "invalid mode");
   }

   return 0;
}
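
A further check that could complement the program above is sketched below:
it verifies that a freshly created O_TMPFILE file reports a link count of 0
and that the count becomes 1 once the file is linked into the directory. The
function name and return convention are hypothetical, the fallback O_TMPFILE
value assumes the asm-generic flag layout, and the /proc path is used so that
no extra capability is needed for linkat().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_TMPFILE
/* Assumption: asm-generic flag values; some architectures use others. */
#define O_TMPFILE (020000000 | O_DIRECTORY)
#endif

/*
 * Returns 0 if the link counts look right, 1 if the check could not be
 * run (no O_TMPFILE support, no /proc, target already exists, ...), and
 * -1 if an unexpected link count was seen.
 */
static int check_tmpfile_nlink(const char *dir, const char *name)
{
	char target[4096], proc[64];
	struct stat st;
	int ret = 1;
	int fd = open(dir, O_TMPFILE | O_RDWR, 0600);

	if (fd == -1)
		return 1;	/* filesystem or kernel lacks O_TMPFILE */

	if (fstat(fd, &st) != 0)
		goto out;
	if (st.st_nlink != 0) {
		ret = -1;	/* an unnamed file should have no links */
		goto out;
	}

	snprintf(target, sizeof(target), "%s/%s", dir, name);
	snprintf(proc, sizeof(proc), "/proc/self/fd/%d", fd);
	if (linkat(AT_FDCWD, proc, AT_FDCWD, target, AT_SYMLINK_FOLLOW) != 0)
		goto out;	/* treated as "could not run" */

	if (fstat(fd, &st) == 0)
		ret = (st.st_nlink == 1) ? 0 : -1;
	unlink(target);		/* clean up the name we created */
out:
	close(fd);
	return ret;
}
```

Calling check_tmpfile_nlink(".", "some-unused-name") on an XFS mount with the
patchset applied would be expected to return 0, under the assumptions above.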


Changelog from rfc:
 - Addressed the comments from Dave Chinner and Christoph Hellwig.

Zhi Yong Wu (5):
  xfs: factor prid related codes into xfs_get_initial_prid()
  xfs: adjust the interface of xfs_qm_vop_dqalloc()
  xfs: add xfs_create_tmpfile() for O_TMPFILE support
  xfs: add a new method xfs_vn_tmpfile()
  xfs: allow linkat() on O_TMPFILE files

 fs/xfs/xfs_inode.c  |  142 ---
 fs/xfs/xfs_inode.h  |2 +
 fs/xfs/xfs_ioctl.c  |2 +-
 fs/xfs/xfs_iops.c   |   25 -
 fs/xfs/xfs_qm.c |   50 ++--
 fs/xfs/xfs_quota.h  |6 +-
 fs/xfs/xfs_shared.h |4 +-
 fs/xfs/xfs_symlink.c|2 +-
 fs/xfs/xfs_trans_resv.c |   51 +
 fs/xfs/xfs_trans_resv.h |4 +
 10 files changed, 255 insertions(+), 33 deletions(-)

-- 
1.7.6.5



[PATCH 1/5] xfs: factor prid related codes into xfs_get_initial_prid()

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

It will be reused by the O_TMPFILE creation function.

Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_inode.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 001aa89..e8b9a68 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1139,6 +1139,14 @@ xfs_bumplink(
return 0;
 }
 
+static inline prid_t xfs_get_initial_prid(struct xfs_inode *dp)
+{
+   if (dp->i_d.di_flags & XFS_DIFLAG_PROJINHERIT)
+   return xfs_get_projid(dp);
+   else
+   return XFS_PROJID_DEFAULT;
+}
+
 int
 xfs_create(
xfs_inode_t *dp,
@@ -1169,10 +1177,7 @@ xfs_create(
if (XFS_FORCED_SHUTDOWN(mp))
return XFS_ERROR(EIO);
 
-   if (dp->i_d.di_flags & XFS_DIFLAG_PROJINHERIT)
-   prid = xfs_get_projid(dp);
-   else
-   prid = XFS_PROJID_DEFAULT;
+   prid = xfs_get_initial_prid(dp);
 
/*
 * Make sure that we have allocated dquot(s) on disk.
-- 
1.7.6.5



[PATCH 4/5] xfs: add a new method xfs_vn_tmpfile()

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Add a new O_TMPFILE method to VFS interface.
For more info, please refer to:
  http://oss.sgi.com/archives/xfs/2013-08/msg00336.html

Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_iops.c |   22 ++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index eb55be5..b57cd89 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -39,6 +39,7 @@
 #include "xfs_da_btree.h"
 #include "xfs_dir2_priv.h"
 #include "xfs_dinode.h"
+#include "xfs_trans_space.h"
 
 #include 
 #include 
@@ -1051,6 +1052,25 @@ xfs_vn_fiemap(
return 0;
 }
 
+STATIC int
+xfs_vn_tmpfile(
+   struct inode    *dir,
+   struct dentry   *dentry,
+   umode_t mode)
+{
+   struct xfs_inode *ip = NULL;
+   int error;
+
+   error = xfs_create_tmpfile(XFS_I(dir), XFS_I(dir)->i_mount,
+   mode, 0, &ip);
+   if (error)
+   return -error;
+
+   d_instantiate(dentry, VFS_I(ip));
+
+   return -error;
+}
+
 static const struct inode_operations xfs_inode_operations = {
.get_acl= xfs_get_acl,
.getattr= xfs_vn_getattr,
@@ -1087,6 +1107,7 @@ static const struct inode_operations xfs_dir_inode_operations = {
    .removexattr    = generic_removexattr,
    .listxattr      = xfs_vn_listxattr,
    .update_time    = xfs_vn_update_time,
+   .tmpfile        = xfs_vn_tmpfile,
 };
 
 static const struct inode_operations xfs_dir_ci_inode_operations = {
@@ -1113,6 +1134,7 @@ static const struct inode_operations xfs_dir_ci_inode_operations = {
    .removexattr    = generic_removexattr,
    .listxattr      = xfs_vn_listxattr,
    .update_time    = xfs_vn_update_time,
+   .tmpfile        = xfs_vn_tmpfile,
 };
 
 static const struct inode_operations xfs_symlink_inode_operations = {
-- 
1.7.6.5



[PATCH 5/5] xfs: allow linkat() on O_TMPFILE files

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Enable O_TMPFILE support in linkat().
For more info, please refer to:
  http://oss.sgi.com/archives/xfs/2013-08/msg00341.html

Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_inode.c  |   21 ++---
 fs/xfs/xfs_trans_resv.c |   20 
 fs/xfs/xfs_trans_resv.h |2 ++
 3 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 48e09c5..2e1fd96 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -62,6 +62,8 @@ kmem_zone_t *xfs_inode_zone;
 
 STATIC int xfs_iflush_int(xfs_inode_t *, xfs_buf_t *);
 
+STATIC int xfs_iunlink_remove(xfs_trans_t *, xfs_inode_t *);
+
 /*
  * helper function to extract extent size hint from inode
  */
@@ -1119,7 +1121,7 @@ xfs_bumplink(
 {
xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
 
-   ASSERT(ip->i_d.di_nlink > 0);
+   ASSERT(ip->i_d.di_nlink > 0 || (VFS_I(ip)->i_state & I_LINKABLE));
ip->i_d.di_nlink++;
inc_nlink(VFS_I(ip));
if ((ip->i_d.di_version == 1) &&
@@ -1455,6 +1457,7 @@ xfs_link(
 {
xfs_mount_t *mp = tdp->i_mount;
xfs_trans_t *tp;
+   struct xfs_trans_res    *tres;
int error;
xfs_bmap_free_t free_list;
xfs_fsblock_t   first_block;
@@ -1480,10 +1483,16 @@ xfs_link(
tp = xfs_trans_alloc(mp, XFS_TRANS_LINK);
cancel_flags = XFS_TRANS_RELEASE_LOG_RES;
resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
-   error = xfs_trans_reserve(tp, &M_RES(mp)->tr_link, resblks, 0);
+
+   if (sip->i_d.di_nlink == 0)
+   tres = &M_RES(mp)->tr_link_tmpfile;
+   else
+   tres = &M_RES(mp)->tr_link;
+
+   error = xfs_trans_reserve(tp, tres, resblks, 0);
if (error == ENOSPC) {
resblks = 0;
-   error = xfs_trans_reserve(tp, &M_RES(mp)->tr_link, 0, 0);
+   error = xfs_trans_reserve(tp, tres, 0, 0);
}
if (error) {
cancel_flags = 0;
@@ -1512,6 +1521,12 @@ xfs_link(
 
xfs_bmap_init(&free_list, &first_block);
 
+   if (sip->i_d.di_nlink == 0) {
+   error = xfs_iunlink_remove(tp, sip);
+   if (error)
+   goto abort_return;
+   }
+
error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
&first_block, &free_list, resblks);
if (error)
diff --git a/fs/xfs/xfs_trans_resv.c b/fs/xfs/xfs_trans_resv.c
index 04519a9..f2da7f4 100644
--- a/fs/xfs/xfs_trans_resv.c
+++ b/fs/xfs/xfs_trans_resv.c
@@ -228,6 +228,22 @@ xfs_calc_link_reservation(
                      XFS_FSB_TO_B(mp, 1))));
 }
 
+/* For creating a link to an O_TMPFILE inode, except modifying
+ * those metadata for regular inode, we still need to remove an inode
+ * from unlinked list at first. That is,  we can modify:
+ *the agi hash list and counters: sector size
+ *the on disk inode before ours in the agi hash list: inode cluster size
+ */
+STATIC uint
+xfs_calc_link_tmpfile_reservation(
+   struct xfs_mount    *mp)
+{
+   return xfs_calc_link_reservation(mp) +
+   xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) +
+   MAX((__uint16_t)XFS_FSB_TO_B(mp, 1),
+   (__uint16_t)XFS_INODE_CLUSTER_SIZE(mp));
+}
+
 /*
  * For removing a directory entry we can modify:
  *the parent directory inode: inode size
@@ -743,6 +759,10 @@ xfs_trans_resv_calc(
resp->tr_link.tr_logcount = XFS_LINK_LOG_COUNT;
resp->tr_link.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
 
+   resp->tr_link_tmpfile.tr_logres = xfs_calc_link_tmpfile_reservation(mp);
+   resp->tr_link_tmpfile.tr_logcount = XFS_LINK_TMPFILE_LOG_COUNT;
+   resp->tr_link_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
+
resp->tr_remove.tr_logres = xfs_calc_remove_reservation(mp);
resp->tr_remove.tr_logcount = XFS_REMOVE_LOG_COUNT;
resp->tr_remove.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
diff --git a/fs/xfs/xfs_trans_resv.h b/fs/xfs/xfs_trans_resv.h
index 285621d..86a0daf 100644
--- a/fs/xfs/xfs_trans_resv.h
+++ b/fs/xfs/xfs_trans_resv.h
@@ -35,6 +35,7 @@ struct xfs_trans_resv {
    struct xfs_trans_res    tr_itruncate;   /* truncate trans */
    struct xfs_trans_res    tr_rename;      /* rename trans */
    struct xfs_trans_res    tr_link;        /* link trans */
+   struct xfs_trans_res    tr_link_tmpfile; /* link O_TMPFILE trans */
    struct xfs_trans_res    tr_remove;      /* unlink trans */
    struct xfs_trans_res    tr_symlink;     /* symlink trans */
    struct xfs_trans_res    tr_create;      /* create trans */
@@ -106,6 +107,7 @@ struct xfs_trans_resv {
 #define XFS_SYMLINK_LOG_COUNT   3
 #defineXFS_REM

[PATCH 3/5] xfs: add xfs_create_tmpfile() for O_TMPFILE support

2013-12-13 Thread Zhi Yong Wu
From: Zhi Yong Wu 

The function is used to create one O_TMPFILE file.
For more info, please refer to:
  http://oss.sgi.com/archives/xfs/2013-08/msg00339.html

Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_inode.c  |  106 +++
 fs/xfs/xfs_inode.h  |2 +
 fs/xfs/xfs_shared.h |4 +-
 fs/xfs/xfs_trans_resv.c |   31 ++
 fs/xfs/xfs_trans_resv.h |2 +
 5 files changed, 144 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 71a8186..48e09c5 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1342,6 +1342,112 @@ xfs_create(
 }
 
 int
+xfs_create_tmpfile(
+   struct xfs_inode    *dp,
+   struct xfs_mount    *mp,
+   umode_t             mode,
+   dev_t               rdev,
+   struct xfs_inode    **ipp)
+{
+   struct xfs_inode    *ip = NULL;
+   struct xfs_trans    *tp = NULL;
+   int                 error;
+   uint                cancel_flags = XFS_TRANS_RELEASE_LOG_RES;
+   struct xfs_dquot    *udqp = NULL;
+   struct xfs_dquot    *gdqp = NULL;
+   struct xfs_dquot    *pdqp = NULL;
+   struct xfs_trans_res *tres;
+   uint                resblks;
+
+   if (XFS_FORCED_SHUTDOWN(mp))
+   return XFS_ERROR(EIO);
+
+   /*
+* Make sure that we have allocated dquot(s) on disk.
+*/
+   error = xfs_qm_vop_dqalloc(dp, mp, xfs_kuid_to_uid(current_fsuid()),
+   xfs_kgid_to_gid(current_fsgid()),
+   xfs_get_initial_prid(dp),
+   XFS_QMOPT_QUOTALL | XFS_QMOPT_INHERIT,
+   &udqp, &gdqp, &pdqp);
+   if (error)
+   return error;
+
+   resblks = XFS_IALLOC_SPACE_RES(mp);
+   tp = xfs_trans_alloc(mp, XFS_TRANS_CREATE_TMPFILE);
+
+   tres = &M_RES(mp)->tr_create_tmpfile;
+   error = xfs_trans_reserve(tp, tres, resblks, 0);
+   if (error == ENOSPC) {
+   /* No space at all so try a "no-allocation" reservation */
+   resblks = 0;
+   error = xfs_trans_reserve(tp, tres, 0, 0);
+   }
+   if (error) {
+   cancel_flags = 0;
+   goto out_trans_cancel;
+   }
+
+   error = xfs_trans_reserve_quota(tp, mp, udqp, gdqp,
+   pdqp, resblks, 1, 0);
+   if (error)
+   goto out_trans_cancel;
+
+   error = xfs_dir_ialloc(&tp, NULL, mode, 0, rdev,
+   XFS_PROJID_DEFAULT, resblks > 0,
+   &ip, NULL);
+   if (error) {
+   if (error == ENOSPC)
+   goto out_trans_cancel;
+   goto out_trans_abort;
+   }
+
+   if (mp->m_flags & XFS_MOUNT_WSYNC)
+   xfs_trans_set_sync(tp);
+
+   /*
+* Attach the dquot(s) to the inodes and modify them incore.
+* These ids of the inode couldn't have changed since the new
+* inode has been locked ever since it was created.
+*/
+   xfs_qm_vop_create_dqattach(tp, ip, udqp, gdqp, pdqp);
+
+   error = xfs_iunlink(tp, ip);
+   if (error)
+   goto out_trans_abort;
+
+   error = xfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES);
+   if (error)
+   goto out_release_inode;
+
+   xfs_qm_dqrele(udqp);
+   xfs_qm_dqrele(gdqp);
+   xfs_qm_dqrele(pdqp);
+
+   *ipp = ip;
+   return 0;
+
+ out_trans_abort:
+   cancel_flags |= XFS_TRANS_ABORT;
+ out_trans_cancel:
+   xfs_trans_cancel(tp, cancel_flags);
+ out_release_inode:
+   /*
+* Wait until after the current transaction is aborted to
+* release the inode.  This prevents recursive transactions
+* and deadlocks from xfs_inactive.
+*/
+   if (ip)
+   IRELE(ip);
+
+   xfs_qm_dqrele(udqp);
+   xfs_qm_dqrele(gdqp);
+   xfs_qm_dqrele(pdqp);
+
+   return error;
+}
+
+int
 xfs_link(
xfs_inode_t *tdp,
xfs_inode_t *sip,
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 9e6efccb..5699cc6 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -323,6 +323,8 @@ int xfs_lookup(struct xfs_inode *dp, struct xfs_name *name,
                struct xfs_inode **ipp, struct xfs_name *ci_name);
 int xfs_create(struct xfs_inode *dp, struct xfs_name *name,
                umode_t mode, xfs_dev_t rdev, struct xfs_inode **ipp);
+int xfs_create_tmpfile(struct xfs_inode *dp, struct xfs_mount *mp,
+               umode_t mode, xfs_dev_t rdev, struct xfs_inode **ipp);
 int xfs_remove(struct xfs_inode *dp, struct xfs_name *name,
   struct xfs_inode 

Re: [PATCH v6 00/11] VFS hot tracking

2013-12-11 Thread Zhi Yong Wu
Ping ^ 7

On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
> From: Zhi Yong Wu 
>
>   This patchset introduces a hot tracking function in the VFS
> layer, which keeps track of real disk I/O in memory. With it,
> you can easily learn more details about disk I/O and detect
> where the disk I/O hot spots are. A specific FS can also make
> use of it for accurate defragmentation, hot relocation
> support, etc.
>
>   Now it's time to send out its V6 for external review, and
> any comments or ideas are appreciated, thanks.
>
> NOTE:
>
>   The patchset can be obtained via my kernel dev git on github:
> git://github.com/wuzhy/kernel.git hot_tracking
>   If you're interested, you can also review them via
> https://github.com/wuzhy/kernel/commits/hot_tracking
>
>   For how to use and more other info and performance report,
> please check hot_tracking.txt in Documentation and following
> links:
>   1.) http://lwn.net/Articles/525651/
>   2.) https://lkml.org/lkml/2012/12/20/199
>
>   This patchset has been through scalability and performance tests
> with fs_mark, ffsb and compilebench.
>
>   The perf testings were done on Linux 3.12.0-rc7 with Model IBM,8231-E2C
> Big Endian PPC64 with 64 CPUs and 2 NUMA nodes, 250G RAM and 1.50 TiB
> test hard disk where each test file size is 20G or 100G.
> Architecture:  ppc64
> Byte Order:Big Endian
> CPU(s):64
> On-line CPU(s) list:   0-63
> Thread(s) per core:4
> Core(s) per socket:1
> Socket(s): 16
> NUMA node(s):  2
> Model: IBM,8231-E2C
> Hypervisor vendor: pHyp
> Virtualization type:   full
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  4096K
> NUMA node0 CPU(s): 0-31
> NUMA node1 CPU(s): 32-63
>
>   Below is the perf testing report:
>
>   Please focus on the two key points:
>   - The overall overhead which is injected by the patchset
>   - The stability of the perf results
>
> 1. fio tests
>
> (all values in KB/s)
>
>                                   w/o hot  w/ hot tracking
>                                  tracking
> RAM size                              32G     32G     16G      8G      4G      2G    250G
>
> sequential-8k-1jobs-read            61260   60918   60901   62610   60992   60213   60948
> sequential-8k-1jobs-write            1329    1329    1328    1329    1328    1329    1329
> sequential-8k-8jobs-read            91139   92614   90907   89895   92022   90851   91877
> sequential-8k-8jobs-write            2523    2522    2516    2521    2516    2518    2521
> sequential-256k-1jobs-read         151432  151403  151406  151422  151344  151446  151372
> sequential-256k-1jobs-write         33451   33470   33481   33470   33459   33472   33477
> sequential-256k-8jobs-read         235291  234555  234251  233656  234927  236380  235535
> sequential-256k-8jobs-write         62419   62402   62191   62859   62629   62720   62523
>
> random-io-mix-8k-1jobs  [READ]       2929    2942    2946    2929    2934    2947    2946
>                         [WRITE]      1262    1266    1257    1262    1257    1257    1265
> random-io-mix-8k-8jobs  [READ]       2444    2442    2436    2416    2353    2441    2442
>                         [WRITE]      1047    1044    1047    1028    1017    1034    1049
> random-io-mix-8k-16jobs [READ]       2182    2184    2169    2178    2190    2184    2180
>                         [WRITE]       932     930     943     936     937     929     931
>
> The figures above are the aggregate bandwidth of the threads in each group.
> If you would like other perf parameters or the raw fio results,
> please let me know, thanks.
>
> 2. Locking stat - Contention & Cacheline Bouncing
>
> RAM size  class name  con-bounces  contentions  acq-bounces  acquisitions  cacheline bouncing ratio  locking contention ratio
>
>   &(&root->t_lock)->rlock:  15081592 157834  
> 3

Re: [PATCH v6 07/11] VFS hot tracking: Add a /proc interface to control memory usage

2013-12-11 Thread Zhi Yong Wu
Ping ^ 7

On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
> From: Zhi Yong Wu 
>
> Introduce a /proc interface, hot-mem-high-thresh, to cap the
> memory consumed by hot_inode_item and hot_range_item. The
> value is in units of 1M bytes.
>
> Signed-off-by: Chandra Seetharaman 
> Signed-off-by: Zhi Yong Wu 
> ---
>  fs/hot_tracking.c| 29 +
>  fs/hot_tracking.h| 23 +++
>  include/linux/hot_tracking.h |  3 +++
>  kernel/sysctl.c  |  7 +++
>  4 files changed, 62 insertions(+)
>
> diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
> index 7a9bd4f..2c5a7fd 100644
> --- a/fs/hot_tracking.c
> +++ b/fs/hot_tracking.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include "hot_tracking.h"
>
> +int sysctl_hot_mem_high_thresh __read_mostly = 0;
>  int sysctl_hot_update_interval __read_mostly = 150;
>
>  /* kmem_cache pointers for slab caches */
> @@ -32,6 +33,7 @@ static void hot_range_item_init(struct hot_range_item *hr,
> hr->len = 1 << RANGE_BITS;
> hr->hot_inode = he;
> atomic_long_inc(&he->hot_root->hot_cnt);
> +   hot_mem_limit_add(he->hot_root, sizeof(struct hot_range_item));
>  }
>
>  static void hot_range_item_free_cb(struct rcu_head *head)
> @@ -55,6 +57,7 @@ static void hot_range_item_free(struct kref *kref)
> spin_unlock(&root->m_lock);
>
> atomic_long_dec(&root->hot_cnt);
> +   hot_mem_limit_sub(root, sizeof(struct hot_range_item));
> call_rcu(&hr->rcu, hot_range_item_free_cb);
>  }
>
> @@ -103,6 +106,8 @@ redo:
>  * newly allocated item.
>  */
> atomic_long_dec(&he->hot_root->hot_cnt);
> +   hot_mem_limit_sub(he->hot_root,
> +   sizeof(struct hot_range_item));
> kmem_cache_free(hot_range_item_cachep, hr_new);
> }
> spin_unlock(&he->i_lock);
> @@ -205,6 +210,7 @@ static void hot_inode_item_init(struct hot_inode_item *he,
> he->hot_root = root;
> spin_lock_init(&he->i_lock);
> atomic_long_inc(&root->hot_cnt);
> +   hot_mem_limit_add(root, sizeof(struct hot_inode_item));
>  }
>
>  static void hot_inode_item_free_cb(struct rcu_head *head)
> @@ -226,6 +232,7 @@ static void hot_inode_item_free(struct kref *kref)
> hot_range_tree_free(he);
>
> atomic_long_dec(&he->hot_root->hot_cnt);
> +   hot_mem_limit_sub(he->hot_root, sizeof(struct hot_inode_item));
> call_rcu(&he->rcu, hot_inode_item_free_cb);
>  }
>
> @@ -272,6 +279,8 @@ redo:
>  * newly allocated item.
>  */
> atomic_long_dec(&root->hot_cnt);
> +   hot_mem_limit_sub(root,
> +   sizeof(struct hot_inode_item));
> kmem_cache_free(hot_inode_item_cachep, he_new);
> }
> spin_unlock(&root->t_lock);
> @@ -534,6 +543,23 @@ static unsigned long hot_item_evict(struct hot_info *root, unsigned long work,
> return freed;
>  }
>
> +static void hot_mem_evict(struct hot_info *root)
> +{
> +   unsigned long sum, thresh;
> +
> +   if (sysctl_hot_mem_high_thresh == 0)
> +   return;
> +
> +   sum = hot_mem_limit_sum(root);
> +   /* Note: sysctl_** is in the unit of 1M bytes */
> +   thresh = sysctl_hot_mem_high_thresh;
> +   thresh *= 1024 * 1024;
> +   if (sum <= thresh)
> +   return;
> +
> +   hot_item_evict(root, sum - thresh, hot_mem_limit_sum);
> +}
> +
>  /*
>   * Every sync period we update temperatures for
>   * each hot inode item and hot range item for aging
> @@ -546,6 +572,8 @@ static void hot_update_worker(struct work_struct *work)
> struct hot_inode_item *he;
> struct rb_node *node;
>
> +   hot_mem_evict(root);
> +
> rcu_read_lock();
> node = root->hot_inode_tree.rb_node;
> while (node) {
> @@ -753,6 +781,7 @@ int hot_track_init(struct super_block *sb)
> goto err;
> }
>
> +   hot_mem_limit_init(root);
> sb->s_hot_root = root;
> sb->s_flags |= MS_HOTTRACK;
>
> diff --git a/fs/hot_tracking.h
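
[Editor's note] The threshold arithmetic in hot_mem_evict() above can be
tried outside the kernel. The following is only a user-space sketch under
stated assumptions: hot_mem_excess() is a hypothetical helper name that does
not appear in the patch, and it merely mirrors the "threshold in units of
1M bytes, 0 means no cap" logic shown in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space helper (not part of the patch): returns how many
 * bytes would have to be evicted for a given usage and threshold.
 */
uint64_t hot_mem_excess(uint64_t sum_bytes, unsigned int thresh_mb)
{
	uint64_t thresh_bytes;

	/* A threshold of 0 means "no cap", as in hot_mem_evict(). */
	if (thresh_mb == 0)
		return 0;

	/* The sysctl value is in units of 1M bytes. */
	thresh_bytes = (uint64_t)thresh_mb * 1024 * 1024;
	assert(thresh_bytes / (1024 * 1024) == thresh_mb); /* no overflow */

	/* Nothing to evict while usage is at or below the cap. */
	if (sum_bytes <= thresh_bytes)
		return 0;

	return sum_bytes - thresh_bytes;
}
```

With a 4M threshold, 5M of tracked items would yield 1M to evict, while
usage at or below 4M yields nothing.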

[PATCH] vfs, eventfd: fix the typo

2013-12-11 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 fs/eventfd.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 35470d9..710bf80 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -45,7 +45,7 @@ struct eventfd_ctx {
  *
  * This function is supposed to be called by the kernel in paths that do not
  * allow sleeping. In this function we allow the counter to reach the ULLONG_MAX
- * value, and we signal this as overflow condition by returining a POLLERR
+ * value, and we signal this as overflow condition by returning a POLLERR
  * to poll(2).
  *
  * Returns the amount by which the counter was incrememnted.  This will be less
-- 
1.7.6.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()

2013-12-10 Thread Zhi Yong Wu
On Wed, Dec 11, 2013 at 11:19 AM, David Miller  wrote:
> From: Zhi Yong Wu 
> Date: Wed, 11 Dec 2013 11:14:04 +0800
>
>> Just one reminder: since David has committed the two patches, you
>> may need to take their impact on your patches into account.
>
> I reverted these changes from net-next.
That was quick :), thanks for the reminder.



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 2/2] tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()

2013-12-10 Thread Zhi Yong Wu
Just one reminder: since David has committed the two patches, you
may need to take their impact on your patches into account.

On Wed, Dec 11, 2013 at 3:00 AM, David Miller  wrote:
> From: Vlad Yasevich 
> Date: Tue, 10 Dec 2013 12:18:09 -0500
>
>> On 12/09/2013 08:36 PM, David Miller wrote:
>>> From: Zhi Yong Wu 
>>> Date: Sat,  7 Dec 2013 04:55:00 +0800
>>>
>>>> From: Zhi Yong Wu 
>>>>
>>>> Having checked the related code, ret can never be greater than len or
>>>> total_len, so remove the useless code in both functions.
>>>>
>>>> Signed-off-by: Zhi Yong Wu 
>>>
>>> Applied.
>>
>> Wait a sec.  We want to be able to return a value bigger than len
>> to trigger a MSG_TRUNC.  Jason has patches to fix this.  If you
>> apply this, we'll have to re-introduce this code.
>>
>> Same goes for patch 1/2.
>
> That's fine, right now the code makes no sense as the condition can
> never be triggered so there is no harm removing the illogical code
> meanwhile.
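
[Editor's note] The truncation behaviour discussed above can be sketched in
user space. This is only an illustration under assumptions: finish_recv()
and struct recv_result are hypothetical names, and the function simply
restates the "if (ret > total_len)" branch from the patch, given a packet
length and a buffer length.

```c
#include <assert.h>
#include <sys/socket.h>	/* MSG_TRUNC */

/* Hypothetical result type: return value plus flags reported to the caller. */
struct recv_result {
	long ret;
	int msg_flags;
};

/*
 * Hypothetical helper: pkt_len is the full packet size, buf_len the size of
 * the caller's buffer, flags the flags passed to the receive call.
 */
struct recv_result finish_recv(long pkt_len, long buf_len, int flags)
{
	struct recv_result r = { pkt_len, 0 };

	assert(pkt_len >= 0 && buf_len >= 0);

	if (pkt_len > buf_len) {
		/* The packet did not fit: report truncation. */
		r.msg_flags |= MSG_TRUNC;
		/* With MSG_TRUNC requested, return the real packet length. */
		r.ret = (flags & MSG_TRUNC) ? pkt_len : buf_len;
	}
	return r;
}
```

The branch only matters if the underlying read can report a length larger
than the buffer, which is the point Vlad raises.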



-- 
Regards,

Zhi Yong Wu


Re: [ANNOUNCE] iproute2 3.12.0 release

2013-12-08 Thread Zhi Yong Wu
Hi,

The tc manpage has no information about "tc action". Is there any
plan to add it soon, or am I missing something?

On Sat, Nov 23, 2013 at 9:20 AM, Stephen Hemminger
 wrote:
> A little late but ready and toasty warm here is iproute2 to go with
> 3.12.0 (aka One Giant Leap for Frogkind).
>
> In addition to the usual build and documentation fixes, this
> version includes support for IPv6 on vxlan and GRE, as well as
> the fair queue packet scheduler.
>
> If you have been sitting on changes to iproute2 that are in
> net-next for 3.12 merge window, please submit them now.
>
> Iproute2 package is available at:
>   http://kernel.org/pub/linux/utils/net/iproute2/iproute2-3.12.0.tar.gz
>
> You can download the source from:
>   git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
>
> Stay Warm!
>
> ---
> Andreas Henriksson (1):
>   ss: avoid passing negative numbers to malloc
>
> Christophe Gouault (1):
>   xfrm: enable to set non-wildcard mark 0 on SAs and SPs
>
> Eric Dumazet (3):
>   pkt_sched: fq: Fair Queue packet scheduler
>   tc: support TCA_STATS_RATE_EST64
>   htb: add support for direct_qlen attribute
>
> Fan Du (1):
>   xfrm: use memcpy to suppress gcc phony buffer overflow warning.
>
> Hangbin Liu (1):
>   ipaddrlabel: use uint32_t instead of int32_t
>
> Jamal Hadi Salim (2):
>   tc: introduce simple action
>   action: typo nat fix
>
> Nicolas Dichtel (1):
>   iplink: update available type list
>
> Nigel Kukard (1):
>   Fix tc stats when using -batch mode
>
> Petr Písař (2):
>   iproute2: bridge: document mdb
>   iproute2: bridge: Close file with bridge monitor file
>
> Sami Kerola (1):
>   ip: make -resolve addr to print names rather than addresses
>
> Stefan Tomanek (1):
>   ip rule: add route suppression options
>
> Stephen Hemminger (14):
>   Update kernel headers to net-next for 3.12
>   Update to 3.11 net-next kernel headers
>   nstat: add json output format
>   Update to 3.12-rc1 headers
>   nstat: revise json output
>   ifstat: add json output format
>   lnstat: add json output format
>   lnstat, nstat, ifstat: update man pages
>   tc: add default action to kernel headers
>   ipv6 gre: add entry to ether types
>   Fix handling of qdis without options
>   htb: remove old unused duplicate qdisc name
>   update kernel headers
>   v3.12.0
>
> WANG Cong (1):
>   vxlan: add ipv6 support
>
> x...@mail.ru (2):
>   iproute2: GRE over IPv6 tunnel support.
>   iproute2: ip6gre: update man pages
>



-- 
Regards,

Zhi Yong Wu


[PATCH 2/2] tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Having checked the related code, ret can never be greater than len or
total_len, so remove the useless code in both functions.

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index f9c935a..d61719c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1354,7 +1354,6 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = tun_do_read(tun, tfile, iv, len,
  file->f_flags & O_NONBLOCK);
-   ret = min_t(ssize_t, ret, len);
 out:
tun_put(tun);
return ret;
@@ -1453,10 +1452,6 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
}
ret = tun_do_read(tun, tfile, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
-   if (ret > total_len) {
-   m->msg_flags |= MSG_TRUNC;
-   ret = flags & MSG_TRUNC ? ret : total_len;
-   }
 out:
tun_put(tun);
return ret;
-- 
1.7.6.5



[PATCH 1/2] macvtap: remove useless codes in macvtap_aio_read() and macvtap_recvmsg()

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Having checked the related code, ret can never be greater than len or
total_len, so remove the useless code in both functions.

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 4c6f84c..7f4ccdd 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -871,7 +871,6 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
}
 
ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
-   ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
 out:
return ret;
 }
@@ -1104,10 +1103,6 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
return -EINVAL;
ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
-   if (ret > total_len) {
-   m->msg_flags |= MSG_TRUNC;
-   ret = flags & MSG_TRUNC ? ret : total_len;
-   }
return ret;
 }
 
-- 
1.7.6.5



Re: [PATCH v2 2/2] tun: update file current position

2013-12-06 Thread Zhi Yong Wu
On Sat, Dec 7, 2013 at 1:45 AM, David Miller  wrote:
> From: Zhi Yong Wu 
> Date: Fri,  6 Dec 2013 17:08:50 +0800
>
>> From: Zhi Yong Wu 
>>
>> Signed-off-by: Zhi Yong Wu 
>
> Also applied and queued up for -stable, thanks.
>
> I noticed in these two cases that that min_t() adjustment of 'ret'
> seems strange.  I can't understand why it's needed.
>
> If, for example, tun_do_read() really did read more than 'len'
> bytes:
>
> 1) That would write past the end of the buffer.
>
> 2) Writing a different value to the ->ki_pos would mean
>that ->ki_pos is now inaccurate.
>
> Unless someone can explain why the min_t() is needed, we should remove
> it.
Yes, from my side it also seems impossible for ret to be bigger than
len or total_len.
So we should also remove the branch "if (ret > total_len) {...}" in xxx_recvmsg().
If you would like a patch submitted for this, please let me know, thanks.
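
[Editor's note] The argument above is that a read which never copies more
than 'len' bytes makes a later min_t() clamp a no-op. A minimal user-space
sketch of that reasoning, under assumptions: bounded_read() is a
hypothetical stand-in and not tun_do_read(), and 'pos' plays the role of
->ki_pos.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for a packet read; copies at most len bytes. */
long bounded_read(const char *pkt, long pkt_len, char *buf, long len,
		  long long *pos)
{
	long n;

	if (len < 0 || pkt_len < 0)
		return -EINVAL;

	/* Never write past the end of the caller's buffer. */
	n = pkt_len < len ? pkt_len : len;
	memcpy(buf, pkt, (size_t)n);

	/* Advance the position only when data was actually read. */
	if (n > 0)
		*pos += n;

	/* n <= len always holds here, so min(n, len) would equal n. */
	assert(n <= len);
	return n;
}
```

If the read could instead report the full packet length while copying only
'len' bytes, the clamp would no longer be redundant.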


-- 
Regards,

Zhi Yong Wu


[PATCH v4 net-next 2/4] macvtap: remove the dead branch

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..f599c47 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -588,7 +588,7 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
 }
 
-static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
+static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
   struct virtio_net_hdr *vnet_hdr)
 {
memset(vnet_hdr, 0, sizeof(*vnet_hdr));
@@ -619,8 +619,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
-
-   return 0;
 }
 
 /* Get packet from user space buffer */
@@ -778,9 +776,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;
 
-   ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
-   if (ret)
-   return ret;
+   macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[PATCH v4 net-next 4/4] tun: remove unused parameter in tun_do_read()

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 782e38b..f9c935a 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1289,8 +1289,7 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-  struct kiocb *iocb, const struct iovec *iv,
-  ssize_t len, int noblock)
+  const struct iovec *iv, ssize_t len, int noblock)
 {
DECLARE_WAITQUEUE(wait, current);
struct sk_buff *skb;
@@ -1353,7 +1352,7 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = tun_do_read(tun, tfile, iocb, iv, len,
+   ret = tun_do_read(tun, tfile, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
 out:
@@ -1452,7 +1451,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 SOL_PACKET, TUN_TX_TIMESTAMP);
goto out;
}
-   ret = tun_do_read(tun, tfile, iocb, m->msg_iov, total_len,
+   ret = tun_do_read(tun, tfile, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[PATCH v4 net-next 3/4] macvtap: remove unused parameter in macvtap_do_read()

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index f599c47..4c6f84c 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -819,7 +819,7 @@ done:
return ret ? ret : copied;
 }
 
-static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
+static ssize_t macvtap_do_read(struct macvtap_queue *q,
   const struct iovec *iv, unsigned long len,
   int noblock)
 {
@@ -870,7 +870,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
 out:
return ret;
@@ -1102,7 +1102,7 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, iocb, m->msg_iov, total_len,
+   ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[PATCH v4 net-next 0/4] net: some cleanups

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since net-next is open now, it's time to post these patches again.

Changelog from v3:
 -combine the change that removes the return value check with
the change which adjusts the function return type to "void". [David Miller]

Zhi Yong Wu (4):
  vhost: remove the dead branch
  macvtap: remove the dead branch
  macvtap: remove unused parameter in macvtap_do_read()
  tun: remove unused parameter in tun_do_read()

 drivers/net/macvtap.c |   14 +-
 drivers/net/tun.c |7 +++
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 7 files changed, 14 insertions(+), 37 deletions(-)

-- 
1.7.6.5



[PATCH v4 net-next 1/4] vhost: remove the dead branch

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, some branches can never run
and should be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,12 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..1e4c75c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,18 +1417,13 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,12 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5
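
[Editor's note] The cleanup in this patch follows a general pattern: when
an init function can only ever return 0, the callers' error branches are
dead code and the function can become void. A small stand-alone sketch of
that pattern, under assumptions: struct demo_dev, demo_dev_init_old(),
demo_dev_init() and demo_open() are hypothetical names, not vhost code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device structure for illustration only. */
struct demo_dev {
	int nvqs;
	int ready;
};

/* Before: always returns 0, so an "if (r < 0)" in callers never runs. */
long demo_dev_init_old(struct demo_dev *dev, int nvqs)
{
	dev->nvqs = nvqs;
	dev->ready = 1;
	return 0;
}

/* After: nothing can fail, so the function returns void. */
void demo_dev_init(struct demo_dev *dev, int nvqs)
{
	dev->nvqs = nvqs;
	dev->ready = 1;
}

/* Caller after the cleanup: no return-value check, no error-path frees. */
int demo_open(struct demo_dev *dev)
{
	assert(dev != NULL);
	demo_dev_init(dev, 2);
	return 0;
}
```

The void signature also lets the compiler reject any new caller that tries
to test a return value that carries no information.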



[PATCH v2 1/2] macvtap: update file current position

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..e6e2dd6 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -876,6 +876,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+   if (ret > 0)
+   iocb->ki_pos += ret;
 out:
return ret;
 }
-- 
1.7.6.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/2] tun: update file current position

2013-12-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 782e38b..c8ddbd0 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1356,6 +1356,8 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
ret = tun_do_read(tun, tfile, iocb, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
+   if (ret > 0)
+   iocb->ki_pos += ret;
 out:
tun_put(tun);
return ret;
-- 
1.7.6.5



[net-next PATCH v3 3/6] macvtap: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..d271fb4 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -779,8 +779,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
return -EINVAL;
 
ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
-   if (ret)
-   return ret;
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCH v3 2/6] vhost: adjust vhost_dev_init() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |4 ++--
 drivers/vhost/scsi.c  |2 +-
 drivers/vhost/test.c  |3 +--
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0554785..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,7 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3164680..1e4c75c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,7 +1417,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 99cb960..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,7 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5



[net-next PATCH v3 5/6] macvtap: remove unused parameter in macvtap_do_read()

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index f599c47..4c6f84c 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -819,7 +819,7 @@ done:
return ret ? ret : copied;
 }
 
-static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
+static ssize_t macvtap_do_read(struct macvtap_queue *q,
   const struct iovec *iv, unsigned long len,
   int noblock)
 {
@@ -870,7 +870,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
 out:
return ret;
@@ -1102,7 +1102,7 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, iocb, m->msg_iov, total_len,
+   ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[net-next PATCH v3 4/6] macvtap: adjust macvtap_skb_to_vnet_hdr() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index d271fb4..f599c47 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -588,7 +588,7 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
 }
 
-static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
+static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
   struct virtio_net_hdr *vnet_hdr)
 {
memset(vnet_hdr, 0, sizeof(*vnet_hdr));
@@ -619,8 +619,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
-
-   return 0;
 }
 
 /* Get packet from user space buffer */
@@ -778,7 +776,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;
 
-   ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
+   macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, 
sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCH v3 6/6] tun: remove unused parameter in tun_do_read()

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 782e38b..f9c935a 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1289,8 +1289,7 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-  struct kiocb *iocb, const struct iovec *iv,
-  ssize_t len, int noblock)
+  const struct iovec *iv, ssize_t len, int noblock)
 {
DECLARE_WAITQUEUE(wait, current);
struct sk_buff *skb;
@@ -1353,7 +1352,7 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = tun_do_read(tun, tfile, iocb, iv, len,
+   ret = tun_do_read(tun, tfile, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
 out:
@@ -1452,7 +1451,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 SOL_PACKET, TUN_TX_TIMESTAMP);
goto out;
}
-   ret = tun_do_read(tun, tfile, iocb, m->msg_iov, total_len,
+   ret = tun_do_read(tun, tfile, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[net-next PATCH v3 0/6] net: some cleanups

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since net-next is open now, it's time to post them again.

Zhi Yong Wu (6):
  vhost: remove the dead branch
  vhost: adjust vhost_dev_init() to be void
  macvtap: remove the dead branch
  macvtap: adjust macvtap_skb_to_vnet_hdr() to be void
  macvtap: remove unused parameter in macvtap_do_read()
  tun: remove unused parameter in tun_do_read()

 drivers/net/macvtap.c |   14 +-
 drivers/net/tun.c |7 +++
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 7 files changed, 14 insertions(+), 37 deletions(-)

-- 
1.7.6.5



[net-next PATCH v3 1/6] vhost: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, the error branches that check
its return value are never run and can be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c  |5 -
 drivers/vhost/scsi.c |5 -
 drivers/vhost/test.c |5 -
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..0554785 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -707,11 +707,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].sock_hlen = 0;
}
r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..3164680 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1421,14 +1421,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..99cb960 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -118,11 +118,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
 
f->private_data = n;
 
-- 
1.7.6.5
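
A minimal sketch of the pattern this patch removes, written as plain
userspace C. demo_dev_init() is a hypothetical stand-in for
vhost_dev_init() and, like it, has no failure path; none of the demo_*
names exist in the kernel.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for vhost_dev_init(): it cannot fail. */
long demo_dev_init(int *state)
{
	assert(state != NULL);
	*state = 1;
	return 0;	/* the only value this function ever returns */
}

/* Before: the caller carries an error branch that can never run. */
int demo_open_before(void)
{
	int *state = malloc(sizeof(*state));
	long r;

	if (!state)
		return -1;

	r = demo_dev_init(state);
	if (r < 0) {	/* dead code: r is always 0 */
		free(state);
		return (int)r;
	}

	free(state);
	return 0;
}

/* After: the dead branch is gone; behaviour is unchanged. */
int demo_open_after(void)
{
	int *state = malloc(sizeof(*state));

	if (!state)
		return -1;

	(void)demo_dev_init(state);

	free(state);
	return 0;
}
```

Both callers return the same value for every input, which is why the
removal is a pure cleanup rather than a functional change.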



[PATCH 1/2] macvtap: update file current position

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..957cc5c 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -876,6 +876,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
return ret;
 }
-- 
1.7.6.5



[PATCH 2/2] tun: update file current position

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 782e38b..e26cbea 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1356,6 +1356,8 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
ret = tun_do_read(tun, tfile, iocb, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
tun_put(tun);
return ret;
-- 
1.7.6.5



Re: [net-next PATCH 3/6] macvtap: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
On Fri, Dec 6, 2013 at 2:08 PM, Guenter Roeck  wrote:
> On 12/05/2013 02:28 PM, Zhi Yong Wu wrote:
>>
>> From: Zhi Yong Wu 
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>   drivers/net/macvtap.c |2 --
>>   1 files changed, 0 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index 9093004..d271fb4 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -779,8 +779,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
>> return -EINVAL;
>>
>> ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
>> -   if (ret)
>> -   return ret;
>>
> Assigning the function's return value to ret just to ignore it seems odd.
>
> Might make sense to change the function type to void.
Yes, this is done in the next patch of this series.

>
> Guenter
>



-- 
Regards,

Zhi Yong Wu
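
The point raised in this exchange can be sketched in plain userspace C:
once a helper can no longer fail, making it void removes the temptation
to assign and then ignore its result. The demo_hdr structure and the
demo_* functions below are hypothetical and only loosely modelled on
macvtap_skb_to_vnet_hdr() and its caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical header, standing in for struct virtio_net_hdr. */
struct demo_hdr {
	unsigned char flags;
	unsigned short len;
};

/* The helper only fills in fields, so it returns nothing. */
void demo_fill_hdr(struct demo_hdr *hdr, unsigned short len)
{
	assert(hdr != NULL);
	memset(hdr, 0, sizeof(*hdr));
	hdr->len = len;
}

/* The caller no longer stores a return value it would never check. */
int demo_put(struct demo_hdr *out, unsigned short len)
{
	struct demo_hdr hdr;

	assert(out != NULL);
	demo_fill_hdr(&hdr, len);
	memcpy(out, &hdr, sizeof(hdr));
	return (int)sizeof(hdr);
}
```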


[net-next PATCHv2 3/8] macvtap: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..d271fb4 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -779,8 +779,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
return -EINVAL;
 
ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
-   if (ret)
-   return ret;
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, 
sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCHv2 1/8] vhost: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, the error branches that check
its return value are never run and can be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c  |5 -
 drivers/vhost/scsi.c |5 -
 drivers/vhost/test.c |5 -
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..0554785 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -707,11 +707,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].sock_hlen = 0;
}
r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..3164680 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1421,14 +1421,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..99cb960 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -118,11 +118,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
 
f->private_data = n;
 
-- 
1.7.6.5



[net-next PATCHv2 6/8] macvtap: remove unused parameter in macvtap_do_read()

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 4914d85..4a34bcb 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -819,7 +819,7 @@ done:
return ret ? ret : copied;
 }
 
-static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
+static ssize_t macvtap_do_read(struct macvtap_queue *q,
   const struct iovec *iv, unsigned long len,
   int noblock)
 {
@@ -870,7 +870,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
if (ret > 0)
iocb->ki_pos = ret;
@@ -1104,7 +1104,7 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, iocb, m->msg_iov, total_len,
+   ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[net-next PATCHv2 7/8] tun: update file current position

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 782e38b..e26cbea 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1356,6 +1356,8 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
ret = tun_do_read(tun, tfile, iocb, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
tun_put(tun);
return ret;
-- 
1.7.6.5



[net-next PATCHv2 5/8] macvtap: update file current position

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index f599c47..4914d85 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -872,6 +872,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
return ret;
 }
-- 
1.7.6.5



[net-next PATCHv2 2/8] vhost: adjust vhost_dev_init() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |4 ++--
 drivers/vhost/scsi.c  |2 +-
 drivers/vhost/test.c  |3 +--
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0554785..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,7 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3164680..1e4c75c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,7 +1417,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 99cb960..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,7 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5



[net-next PATCHv2 8/8] tun: remove unused parameter in tun_do_read()

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/tun.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index e26cbea..fd8cc47 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1289,8 +1289,7 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-  struct kiocb *iocb, const struct iovec *iv,
-  ssize_t len, int noblock)
+  const struct iovec *iv, ssize_t len, int noblock)
 {
DECLARE_WAITQUEUE(wait, current);
struct sk_buff *skb;
@@ -1353,7 +1352,7 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = tun_do_read(tun, tfile, iocb, iv, len,
+   ret = tun_do_read(tun, tfile, iv, len,
  file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
if (ret > 0)
@@ -1454,7 +1453,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 SOL_PACKET, TUN_TX_TIMESTAMP);
goto out;
}
-   ret = tun_do_read(tun, tfile, iocb, m->msg_iov, total_len,
+   ret = tun_do_read(tun, tfile, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[net-next PATCHv2 4/8] macvtap: adjust macvtap_skb_to_vnet_hdr() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index d271fb4..f599c47 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -588,7 +588,7 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
 }
 
-static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
+static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
   struct virtio_net_hdr *vnet_hdr)
 {
memset(vnet_hdr, 0, sizeof(*vnet_hdr));
@@ -619,8 +619,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
-
-   return 0;
 }
 
 /* Get packet from user space buffer */
@@ -778,7 +776,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;
 
-   ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
+   macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, 
sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCHv2 0/8] net: some cleanups

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since net-next is open now, it's time to post them again.

Zhi Yong Wu (8):
  vhost: remove the dead branch
  vhost: adjust vhost_dev_init() to be void
  macvtap: remove the dead branch
  macvtap: adjust macvtap_skb_to_vnet_hdr() to be void
  macvtap: update file current position
  macvtap: remove unused parameter in macvtap_do_read()
  tun: update file current position
  tun: remove unused parameter in tun_do_read()

 drivers/net/macvtap.c |   16 +++-
 drivers/net/tun.c |9 +
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 7 files changed, 18 insertions(+), 37 deletions(-)

-- 
1.7.6.5



Re: [PATCH 1/2] macvtap: update file current position

2013-12-05 Thread Zhi Yong Wu
On Fri, Dec 6, 2013 at 9:44 AM, David Miller  wrote:
> From: Zhi Yong Wu 
> Date: Wed,  4 Dec 2013 17:29:00 +0800
>
>> From: Zhi Yong Wu 
>>
>> Signed-off-by: Zhi Yong Wu 
>
> The tun driver seems to have the same exact bug, please if you are going
> to fix one then fix the other too.
Will post v2 with the tun bugfix.
>
> Thanks.



-- 
Regards,

Zhi Yong Wu
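
For reference, the read-path change discussed here (and the matching tun
change) can be sketched in plain userspace C. struct demo_iocb and
demo_finish_read() are hypothetical; they only mirror the shape of the
patch, which clamps the result to the requested length and then stores a
positive byte count in ki_pos.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ki_pos field of struct kiocb. */
struct demo_iocb {
	long long ki_pos;
};

/*
 * Clamp the read result to the requested length, then record the
 * position only when bytes were actually read. Errors (negative
 * values) and zero-length reads leave ki_pos untouched.
 */
long demo_finish_read(struct demo_iocb *iocb, long ret, size_t len)
{
	assert(iocb != NULL);

	if (ret > 0 && (size_t)ret > len)
		ret = (long)len;	/* same effect as min_t(ssize_t, ret, len) */
	if (ret > 0)
		iocb->ki_pos = ret;	/* assigned, as in the patch, not accumulated */

	return ret;
}
```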


[net-next PATCH 1/6] vhost: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, the error branches that check
its return value are never run and can be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c  |5 -
 drivers/vhost/scsi.c |5 -
 drivers/vhost/test.c |5 -
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..0554785 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -707,11 +707,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].sock_hlen = 0;
}
r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..3164680 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1421,14 +1421,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..99cb960 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -118,11 +118,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
 
f->private_data = n;
 
-- 
1.7.6.5



[net-next PATCH 3/6] macvtap: remove the dead branch

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..d271fb4 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -779,8 +779,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
return -EINVAL;
 
ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
-   if (ret)
-   return ret;
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, 
sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCH 2/6] vhost: adjust vhost_dev_init() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |4 ++--
 drivers/vhost/scsi.c  |2 +-
 drivers/vhost/test.c  |3 +--
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0554785..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,7 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3164680..1e4c75c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,7 +1417,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 99cb960..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,7 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5



[net-next PATCH 6/6] macvtap: remove unused parameter in macvtap_do_read()

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 4914d85..4a34bcb 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -819,7 +819,7 @@ done:
return ret ? ret : copied;
 }
 
-static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
+static ssize_t macvtap_do_read(struct macvtap_queue *q,
   const struct iovec *iv, unsigned long len,
   int noblock)
 {
@@ -870,7 +870,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
if (ret > 0)
iocb->ki_pos = ret;
@@ -1104,7 +1104,7 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, iocb, m->msg_iov, total_len,
+   ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[net-next PATCH 4/6] macvtap: adjust macvtap_skb_to_vnet_hdr() to be void

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index d271fb4..f599c47 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -588,7 +588,7 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
 }
 
-static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
+static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
   struct virtio_net_hdr *vnet_hdr)
 {
memset(vnet_hdr, 0, sizeof(*vnet_hdr));
@@ -619,8 +619,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
-
-   return 0;
 }
 
 /* Get packet from user space buffer */
@@ -778,7 +776,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;
 
-   ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
+   macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[net-next PATCH 0/6] some cleanups

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since net-next is open now, it's time to post them again.

Zhi Yong Wu (6):
  vhost: remove the dead branch
  vhost: adjust vhost_dev_init() to be void
  macvtap: remove the dead branch
  macvtap: adjust macvtap_skb_to_vnet_hdr() to be void
  macvtap: update file current position
  macvtap: remove unused parameter in macvtap_do_read()

 drivers/net/macvtap.c |   16 +++-
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 6 files changed, 13 insertions(+), 33 deletions(-)

-- 
1.7.6.5



[net-next PATCH 5/6] macvtap: update file current position

2013-12-05 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index f599c47..4914d85 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -872,6 +872,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
return ret;
 }
-- 
1.7.6.5
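
The two added lines can be read as "clamp the result, then record it as the file position only when data was actually returned". A minimal user-space sketch of that pattern follows; `struct io_ctx` and `finish_read()` are hypothetical names standing in for `struct kiocb` and the tail of the read path, so treat it as an illustration rather than the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for the I/O control block; only the position is modelled. */
struct io_ctx {
	ssize_t pos;
};

/*
 * Clamp a read result to the requested length and, if any bytes were
 * delivered, store that count as the current position. Errors (negative
 * values) and zero-length reads leave the position untouched.
 */
ssize_t finish_read(struct io_ctx *ctx, ssize_t ret, size_t len)
{
	assert(ctx != NULL);
	if (ret > (ssize_t)len)
		ret = (ssize_t)len;	/* same effect as min_t(ssize_t, ret, len) */
	if (ret > 0)
		ctx->pos = ret;
	return ret;
}
```

Guarding on `ret > 0` matters because a negative error code stored as a position would be meaningless to later callers.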



[PATCH 2/2] macvtap: remove unused parameter in macvtap_do_read()

2013-12-04 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 957cc5c..c0d412e 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -823,7 +823,7 @@ done:
return ret ? ret : copied;
 }
 
-static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
+static ssize_t macvtap_do_read(struct macvtap_queue *q,
   const struct iovec *iv, unsigned long len,
   int noblock)
 {
@@ -874,7 +874,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
goto out;
}
 
-   ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
if (ret > 0)
iocb->ki_pos = ret;
@@ -1108,7 +1108,7 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, iocb, m->msg_iov, total_len,
+   ret = macvtap_do_read(q, m->msg_iov, total_len,
  flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
-- 
1.7.6.5



[PATCH 1/2] macvtap: update file current position

2013-12-04 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9093004..957cc5c 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -876,6 +876,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
 
ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+   if (ret > 0)
+   iocb->ki_pos = ret;
 out:
return ret;
 }
-- 
1.7.6.5



Re: [PATCH v6 00/11] VFS hot tracking

2013-12-03 Thread Zhi Yong Wu
Ping 6,

Is there any reason why this patchset has not been reviewed so far? If
there are no comments, please merge it. Please don't force me to be
impolite, thanks.

On Sat, Nov 30, 2013 at 5:55 PM, Zhi Yong Wu  wrote:
> HI,
>
> Ping again
>
> On Thu, Nov 21, 2013 at 9:57 PM, Zhi Yong Wu  wrote:
>> HI, Maintainers
>>
>> Ping again....
>>
>> On Thu, Nov 14, 2013 at 2:33 AM, Zhi Yong Wu  wrote:
>>> Ping
>>>
>>> On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
>>>> From: Zhi Yong Wu 
>>>>
>>>>   This patchset introduces a hot-tracking function in the VFS
>>>> layer that keeps track of real disk I/O in memory. With it, you
>>>> can easily see more detail about disk I/O and detect where the
>>>> disk I/O hot spots are. Specific filesystems can also make use of
>>>> it for accurate defragmentation, hot relocation support, etc.
>>>>
>>>>   Now it's time to send out its V6 for external review, and
>>>> any comments or ideas are appreciated, thanks.
>>>>
>>>> NOTE:
>>>>
>>>>   The patchset can be obtained via my kernel dev git on github:
>>>> git://github.com/wuzhy/kernel.git hot_tracking
>>>>   If you're interested, you can also review them via
>>>> https://github.com/wuzhy/kernel/commits/hot_tracking
>>>>
>>>>   For usage, further information, and the performance report,
>>>> please see hot_tracking.txt in Documentation and the following
>>>> links:
>>>>   1.) http://lwn.net/Articles/525651/
>>>>   2.) https://lkml.org/lkml/2012/12/20/199
>>>>
>>>>   Scalability or performance tests have been run on this patchset
>>>> with fs_mark, ffsb and compilebench.
>>>>
>>>>   The perf tests were run on Linux 3.12.0-rc7 on an IBM,8231-E2C
>>>> big-endian PPC64 machine with 64 CPUs, 2 NUMA nodes, 250G RAM and
>>>> a 1.50 TiB test hard disk; each test file is 20G or 100G.
>>>> Architecture:          ppc64
>>>> Byte Order:            Big Endian
>>>> CPU(s):                64
>>>> On-line CPU(s) list:   0-63
>>>> Thread(s) per core:    4
>>>> Core(s) per socket:    1
>>>> Socket(s):             16
>>>> NUMA node(s):          2
>>>> Model:                 IBM,8231-E2C
>>>> Hypervisor vendor:     pHyp
>>>> Virtualization type:   full
>>>> L1d cache:             32K
>>>> L1i cache:             32K
>>>> L2 cache:              256K
>>>> L3 cache:              4096K
>>>> NUMA node0 CPU(s):     0-31
>>>> NUMA node1 CPU(s):     32-63
>>>>
>>>>   Below is the perf testing report:
>>>>
>>>>   Please focus on the two key points:
>>>>   - The overall overhead which is injected by the patchset
>>>>   - The stability of the perf results
>>>>
>>>> 1. fio tests
>>>>
>>>>                                w/o hot tracking  w/ hot tracking
>>>> RAM size                       32G               32G         16G         8G          4G          2G          250G
>>>>
>>>> sequential-8k-1jobs-read       61260KB/s         60918KB/s   60901KB/s   62610KB/s   60992KB/s   60213KB/s   60948KB/s
>>>> sequential-8k-1jobs-write      1329KB/s          1329KB/s    1328KB/s    1329KB/s    1328KB/s    1329KB/s    1329KB/s
>>>> sequential-8k-8jobs-read       91139KB/s         92614KB/s   90907KB/s   89895KB/s   92022KB/s   90851KB/s   91877KB/s
>>>> sequential-8k-8jobs-write      2523KB/s          2522KB/s    2516KB/s    2521KB/s    2516KB/s    2518KB/s    2521KB/s
>>>> sequential-256k-1jobs-read     151432KB/s        151403KB/s  151406KB/s  151422KB/s  151344KB/s  151446KB/s  151372KB/s
>>>> sequential-256k-1jobs-write    33451KB/s         33470KB/s   33481KB/s   33470KB/s   33459KB/s   33472KB/s   33477KB/s
>>>> sequential-256k-8jobs-read     235291KB/s        234555KB/s  234251KB/s  233656KB/s  234927KB/s  236380KB/s  235535KB/s
>>>> sequential-256k-8jobs-write    62419KB/s         62402KB/s   62191KB/s   62859KB/s

Re: [PATCH v6 00/11] VFS hot tracking

2013-11-30 Thread Zhi Yong Wu
HI,

Ping again

On Thu, Nov 21, 2013 at 9:57 PM, Zhi Yong Wu  wrote:
> HI, Maintainers
>
> Ping again
>
> On Thu, Nov 14, 2013 at 2:33 AM, Zhi Yong Wu  wrote:
>> Ping
>>
>> On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
>>> From: Zhi Yong Wu 
>>>
>>>   This patchset introduces a hot-tracking function in the VFS
>>> layer that keeps track of real disk I/O in memory. With it, you
>>> can easily see more detail about disk I/O and detect where the
>>> disk I/O hot spots are. Specific filesystems can also make use of
>>> it for accurate defragmentation, hot relocation support, etc.
>>>
>>>   Now it's time to send out its V6 for external review, and
>>> any comments or ideas are appreciated, thanks.
>>>
>>> NOTE:
>>>
>>>   The patchset can be obtained via my kernel dev git on github:
>>> git://github.com/wuzhy/kernel.git hot_tracking
>>>   If you're interested, you can also review them via
>>> https://github.com/wuzhy/kernel/commits/hot_tracking
>>>
>>>   For usage, further information, and the performance report,
>>> please see hot_tracking.txt in Documentation and the following
>>> links:
>>>   1.) http://lwn.net/Articles/525651/
>>>   2.) https://lkml.org/lkml/2012/12/20/199
>>>
>>>   Scalability or performance tests have been run on this patchset
>>> with fs_mark, ffsb and compilebench.
>>>
>>>   The perf tests were run on Linux 3.12.0-rc7 on an IBM,8231-E2C
>>> big-endian PPC64 machine with 64 CPUs, 2 NUMA nodes, 250G RAM and
>>> a 1.50 TiB test hard disk; each test file is 20G or 100G.
>>> Architecture:          ppc64
>>> Byte Order:            Big Endian
>>> CPU(s):                64
>>> On-line CPU(s) list:   0-63
>>> Thread(s) per core:    4
>>> Core(s) per socket:    1
>>> Socket(s):             16
>>> NUMA node(s):          2
>>> Model:                 IBM,8231-E2C
>>> Hypervisor vendor:     pHyp
>>> Virtualization type:   full
>>> L1d cache:             32K
>>> L1i cache:             32K
>>> L2 cache:              256K
>>> L3 cache:              4096K
>>> NUMA node0 CPU(s):     0-31
>>> NUMA node1 CPU(s):     32-63
>>>
>>>   Below is the perf testing report:
>>>
>>>   Please focus on the two key points:
>>>   - The overall overhead which is injected by the patchset
>>>   - The stability of the perf results
>>>
>>> 1. fio tests
>>>
>>>                                w/o hot tracking  w/ hot tracking
>>> RAM size                       32G               32G         16G         8G          4G          2G          250G
>>>
>>> sequential-8k-1jobs-read       61260KB/s         60918KB/s   60901KB/s   62610KB/s   60992KB/s   60213KB/s   60948KB/s
>>> sequential-8k-1jobs-write      1329KB/s          1329KB/s    1328KB/s    1329KB/s    1328KB/s    1329KB/s    1329KB/s
>>> sequential-8k-8jobs-read       91139KB/s         92614KB/s   90907KB/s   89895KB/s   92022KB/s   90851KB/s   91877KB/s
>>> sequential-8k-8jobs-write      2523KB/s          2522KB/s    2516KB/s    2521KB/s    2516KB/s    2518KB/s    2521KB/s
>>> sequential-256k-1jobs-read     151432KB/s        151403KB/s  151406KB/s  151422KB/s  151344KB/s  151446KB/s  151372KB/s
>>> sequential-256k-1jobs-write    33451KB/s         33470KB/s   33481KB/s   33470KB/s   33459KB/s   33472KB/s   33477KB/s
>>> sequential-256k-8jobs-read     235291KB/s        234555KB/s  234251KB/s  233656KB/s  234927KB/s  236380KB/s  235535KB/s
>>> sequential-256k-8jobs-write    62419KB/s         62402KB/s   62191KB/s   62859KB/s   62629KB/s   62720KB/s   62523KB/s
>>> random-io-mix-8k-1jobs [READ]  2929KB/s          2942KB/s    2946KB/s    2929KB/s    2934KB/s    2947KB/s    2946KB/s
>>>                        [WRITE] 1262KB/s          1266KB/s    1257KB/s    1262KB/s    1257KB/s    1257KB/s    1265KB/s
>>> random-io-mix-8k-8jobs [READ]  2444KB/s          2442KB/s    2436KB/s    2416KB/s    2353KB/s    2441KB/s    2442KB/s
>>>                        [WRITE] 1047KB/s          1044KB/s    1047KB/s
>>&
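
The quoted report asks readers to focus on the overhead the patchset adds. One way to read the fio rows is as a relative throughput change against the baseline. The sketch below assumes that the first value in each row is the run without hot tracking and the second is the run with hot tracking at the same 32G RAM size; the helper name `rel_change_pct()` is hypothetical.

```c
#include <assert.h>

/*
 * Relative change, in percent, from a baseline throughput to a measured
 * one. Negative results mean the measured run was slower than baseline.
 */
double rel_change_pct(double baseline_kbs, double measured_kbs)
{
	assert(baseline_kbs > 0.0);
	return (measured_kbs - baseline_kbs) / baseline_kbs * 100.0;
}
```

Under that column assumption, the sequential-8k-1jobs-read row gives rel_change_pct(61260, 60918), which is roughly -0.56%, and the sequential-8k-1jobs-write row (1329 against 1329) gives 0%.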

Re: [RESEND PATCH net-next 0/4] net: several cleanups

2013-11-27 Thread Zhi Yong Wu
On Thu, Nov 28, 2013 at 12:43 PM, David Miller  wrote:
> From: Zhi Yong Wu 
> Date: Thu, 28 Nov 2013 09:31:29 +0800
>
>> Per David's request, it's time to resend them now.
>
> No, it is not the time.
>
> You should not submit these kinds of patches until the net-next
> tree is open again, and I make an announcement here when that
> is the case.  I have yet to make such an announcement, and I
> do not plan to do so for several days as I am travelling and
> will be busy dealing with my backlog of patches once I get
> home.
OK, I will wait for your announcement.



-- 
Regards,

Zhi Yong Wu


[RESEND PATCH net-next 1/4] vhost: remove the dead branch

2013-11-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, some branches are never run
and therefore should be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c  |5 -
 drivers/vhost/scsi.c |5 -
 drivers/vhost/test.c |5 -
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..0554785 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -707,11 +707,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].sock_hlen = 0;
}
r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f175629..3164680 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1421,14 +1421,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..99cb960 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -118,11 +118,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
 
f->private_data = n;
 
-- 
1.7.6.5
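
The reasoning behind this patch and its follow-up in the series (making vhost_dev_init() return void) can be sketched in user-space C. This is an illustration under assumptions, not the vhost code: `struct dev`, `dev_init()` and `dev_open()` are hypothetical names, and the real open paths allocate and initialize far more state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device state; only one field is modelled. */
struct dev {
	int nvqs;
};

/*
 * An initializer that only stores values cannot fail, so it returns
 * void instead of a constant 0 that callers would feel obliged to check.
 */
void dev_init(struct dev *d, int nvqs)
{
	assert(d != NULL);
	d->nvqs = nvqs;
}

/*
 * The caller keeps the error path for the allocation, which can fail,
 * and has no cleanup branch after dev_init(), which cannot.
 */
struct dev *dev_open(int nvqs)
{
	struct dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	dev_init(d, nvqs);
	return d;
}
```

With the return type changed, the compiler also rejects any future caller that tries to test the result, so the dead branch cannot quietly come back.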



[RESEND PATCH net-next 4/4] macvtap: adjust macvtap_skb_to_vnet_hdr() to be void

2013-11-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index eeb1a97..155d60e 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -588,7 +588,7 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
 }
 
-static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
+static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
   struct virtio_net_hdr *vnet_hdr)
 {
memset(vnet_hdr, 0, sizeof(*vnet_hdr));
@@ -619,8 +619,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
-
-   return 0;
 }
 
 /* Get packet from user space buffer */
@@ -779,7 +777,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;
 
-   ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
+   macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[RESEND PATCH net-next 3/4] macvtap: remove the dead branch

2013-11-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/macvtap.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index dc76670..eeb1a97 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -780,8 +780,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
return -EINVAL;
 
ret = macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
-   if (ret)
-   return ret;
 
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
-- 
1.7.6.5



[RESEND PATCH net-next 0/4] net: several cleanups

2013-11-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Per David's request, it's time to resend them now.

Zhi Yong Wu (4):
  vhost: remove the dead branch
  vhost: adjust vhost_dev_init() to be void
  macvtap: remove the dead branch
  macvtap: adjust macvtap_skb_to_vnet_hdr() to be void

 drivers/net/macvtap.c |8 ++--
 drivers/vhost/net.c   |9 ++---
 drivers/vhost/scsi.c  |7 +--
 drivers/vhost/test.c  |8 +---
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 6 files changed, 8 insertions(+), 30 deletions(-)

-- 
1.7.6.5



[RESEND PATCH net-next 2/4] vhost: adjust vhost_dev_init() to be void

2013-11-27 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |4 ++--
 drivers/vhost/scsi.c  |2 +-
 drivers/vhost/test.c  |3 +--
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0554785..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,7 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3164680..1e4c75c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,7 +1417,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 99cb960..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,7 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5



[PATCH] net, virtio_net: replace the magic value

2013-11-18 Thread Zhi Yong Wu
From: Zhi Yong Wu 

It is more appropriate to use the number of queue pairs currently used
by the driver instead of a magic value.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/net/virtio_net.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cdc7c90..e0cb2d1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1619,8 +1619,8 @@ static int virtnet_probe(struct virtio_device *vdev)
if (err)
goto free_stats;
 
-   netif_set_real_num_tx_queues(dev, 1);
-   netif_set_real_num_rx_queues(dev, 1);
+   netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
+   netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
 
err = register_netdev(dev);
if (err) {
-- 
1.7.6.5
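
The idea of the change, deriving the advertised queue counts from driver state rather than a literal, can be sketched outside the kernel. The names below (`struct net_state`, `publish_queue_counts()`) are hypothetical and only loosely model the driver's private data; the sketch makes no claim about what value the field holds at probe time in the real driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver state: the active pair count and what is advertised. */
struct net_state {
	unsigned int curr_queue_pairs;
	unsigned int real_tx;
	unsigned int real_rx;
};

/*
 * Advertise TX/RX queue counts from the single source of truth instead
 * of hard-coding 1, so the two cannot drift apart if the default changes.
 */
void publish_queue_counts(struct net_state *s)
{
	assert(s != NULL && s->curr_queue_pairs >= 1);
	s->real_tx = s->curr_queue_pairs;
	s->real_rx = s->curr_queue_pairs;
}
```

If the field happens to be 1 when this runs, behaviour is identical to the hard-coded version; the gain is that the intent is explicit.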



[PATCH 2/2] vhost: adjust vhost_dev_init() to be void

2013-11-18 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c   |4 ++--
 drivers/vhost/scsi.c  |2 +-
 drivers/vhost/test.c  |3 +--
 drivers/vhost/vhost.c |4 +---
 drivers/vhost/vhost.h |2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0554785..9a68409 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -683,7 +683,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r, i;
+   int i;
 
if (!n)
return -ENOMEM;
@@ -706,7 +706,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].vhost_hlen = 0;
n->vqs[i].sock_hlen = 0;
}
-   r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 9d5e18d..e02b7df 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1417,7 +1417,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
vqs[i] = &vs->vqs[i].vq;
vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
}
-   r = vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+   vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
 
tcm_vhost_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 99cb960..c2a54fb 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -104,7 +104,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
struct vhost_test *n = kmalloc(sizeof *n, GFP_KERNEL);
struct vhost_dev *dev;
struct vhost_virtqueue **vqs;
-   int r;
 
if (!n)
return -ENOMEM;
@@ -117,7 +116,7 @@ static int vhost_test_open(struct inode *inode, struct file *f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
 
f->private_data = n;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 69068e0..78987e4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -290,7 +290,7 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
-long vhost_dev_init(struct vhost_dev *dev,
+void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs)
 {
struct vhost_virtqueue *vq;
@@ -319,8 +319,6 @@ long vhost_dev_init(struct vhost_dev *dev,
vhost_poll_init(&vq->poll, vq->handle_kick,
POLLIN, dev);
}
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_dev_init);
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4465ed5..35eeb2a 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -127,7 +127,7 @@ struct vhost_dev {
struct task_struct *worker;
 };
 
-long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
+void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
-- 
1.7.6.5



[PATCH 1/2] vhost: remove the dead branch

2013-11-18 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Since vhost_dev_init() always returns 0, some branches are never run
and therefore should be removed.

Signed-off-by: Zhi Yong Wu 
Acked-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c  |5 -
 drivers/vhost/scsi.c |5 -
 drivers/vhost/test.c |5 -
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 831eb4f..0554785 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -707,11 +707,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
n->vqs[i].sock_hlen = 0;
}
r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
-   if (r < 0) {
-   kfree(n);
-   kfree(vqs);
-   return r;
-   }
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index e663921..9d5e18d 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1421,14 +1421,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 
tcm_vhost_init_inflight(vs, NULL);
 
-   if (r < 0)
-   goto err_init;
-
f->private_data = vs;
return 0;
 
-err_init:
-   kfree(vqs);
 err_vqs:
vhost_scsi_free(vs);
 err_vs:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 339eae8..99cb960 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -118,11 +118,6 @@ static int vhost_test_open(struct inode *inode, struct file *f)
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
r = vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX);
-   if (r < 0) {
-   kfree(vqs);
-   kfree(n);
-   return r;
-   }
 
f->private_data = n;
 
-- 
1.7.6.5



Re: [PATCH 4/4] net, virtio_net: replace the magic value

2013-11-18 Thread Zhi Yong Wu
On Mon, Nov 18, 2013 at 6:15 PM, Michael S. Tsirkin  wrote:
> On Mon, Nov 18, 2013 at 06:07:45PM +0800, Zhi Yong Wu wrote:
>> On Mon, Nov 18, 2013 at 5:50 PM, Michael S. Tsirkin  wrote:
>> > On Mon, Nov 18, 2013 at 04:46:20PM +0800, Zhi Yong Wu wrote:
>> >> From: Zhi Yong Wu 
>> >>
>> >> It is more appropriate to use the number of queue pairs currently
>> >> used by the driver instead of a magic value.
>> >>
>> >> Signed-off-by: Zhi Yong Wu 
>> >
>> > I don't mind, but driver should be submitted separately
>> > from qemu patches. As it is only patch 4/4 made it to netdev.
>> OK, I will send v2. By the way, could you take a look at the
>> following patches?
>
> Will do.
>
>> Maybe I can send their v2 together.
>
> Please don't, these seem to be completely unrelated.
OK, I will send it separately.
>
>> [PATCH 1/3] vhost: remove the dead branch
>> [PATCH 2/3] vhost: adjust vhost_dev_init() to be void
>> [PATCH 3/3] vhost: fix the wrong log descriptions
>>
>>
>> >
>> >> ---
>> >>  drivers/net/virtio_net.c |4 ++--
>> >>  1 files changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> >> index cdc7c90..e0cb2d1 100644
>> >> --- a/drivers/net/virtio_net.c
>> >> +++ b/drivers/net/virtio_net.c
>> >> @@ -1619,8 +1619,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>> >>   if (err)
>> >>   goto free_stats;
>> >>
>> >> - netif_set_real_num_tx_queues(dev, 1);
>> >> - netif_set_real_num_rx_queues(dev, 1);
>> >> + netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
>> >> + netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
>> >>
>> >>   err = register_netdev(dev);
>> >>   if (err) {
>> >> --
>> >> 1.7.6.5
>> >>
>>
>>
>>
>> --
>> Regards,
>>
>> Zhi Yong Wu



-- 
Regards,

Zhi Yong Wu


Re: [PATCH 4/4] net, virtio_net: replace the magic value

2013-11-18 Thread Zhi Yong Wu
On Mon, Nov 18, 2013 at 5:50 PM, Michael S. Tsirkin  wrote:
> On Mon, Nov 18, 2013 at 04:46:20PM +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu 
>>
>> It is more appropriate to use the number of queue pairs currently
>> used by the driver instead of a magic value.
>>
>> Signed-off-by: Zhi Yong Wu 
>
> I don't mind, but driver should be submitted separately
> from qemu patches. As it is only patch 4/4 made it to netdev.
OK, I will send v2. By the way, could you take a look at the
following patches? Maybe I can send their v2 together.
[PATCH 1/3] vhost: remove the dead branch
[PATCH 2/3] vhost: adjust vhost_dev_init() to be void
[PATCH 3/3] vhost: fix the wrong log descriptions


>
>> ---
>>  drivers/net/virtio_net.c |4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index cdc7c90..e0cb2d1 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1619,8 +1619,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>>   if (err)
>>   goto free_stats;
>>
>> - netif_set_real_num_tx_queues(dev, 1);
>> - netif_set_real_num_rx_queues(dev, 1);
>> + netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
>> + netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
>>
>>   err = register_netdev(dev);
>>   if (err) {
>> --
>> 1.7.6.5
>>



-- 
Regards,

Zhi Yong Wu


[PATCH 4/4] net, virtio_net: replace the magic value

2013-11-18 Thread Zhi Yong Wu
From: Zhi Yong Wu 

It is more appropriate to use # of queue pairs currently used by
the driver instead of a magic value.

Signed-off-by: Zhi Yong Wu 
---
 drivers/net/virtio_net.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cdc7c90..e0cb2d1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1619,8 +1619,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (err)
 		goto free_stats;
 
-	netif_set_real_num_tx_queues(dev, 1);
-	netif_set_real_num_rx_queues(dev, 1);
+	netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
+	netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
 
 	err = register_netdev(dev);
 	if (err) {
-- 
1.7.6.5



Re: [PATCH v6 00/11] VFS hot tracking

2013-11-13 Thread Zhi Yong Wu
Ping

On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
> From: Zhi Yong Wu 
>
>   The patchset is trying to introduce hot tracking function in
> VFS layer, which will keep track of real disk I/O in memory.
> By it, you will easily know more details about disk I/O, and
> then detect where disk I/O hot spots are. Also, specific FS
> can take use of it to do accurate defragment, and hot relocation
> support, etc.
>
>   Now it's time to send out its V6 for external review, and
> any comments or ideas are appreciated, thanks.
>
> NOTE:
>
>   The patchset can be obtained via my kernel dev git on github:
> git://github.com/wuzhy/kernel.git hot_tracking
>   If you're interested, you can also review them via
> https://github.com/wuzhy/kernel/commits/hot_tracking
>
>   For how to use and more other info and performance report,
> please check hot_tracking.txt in Documentation and following
> links:
>   1.) http://lwn.net/Articles/525651/
>   2.) https://lkml.org/lkml/2012/12/20/199
>
>   This patchset has been done scalability or performance tests
> by fs_mark, ffsb and compilebench.
>
>   The perf testings were done on Linux 3.12.0-rc7 with Model IBM,8231-E2C
> Big Endian PPC64 with 64 CPUs and 2 NUMA nodes, 250G RAM and 1.50 TiB
> test hard disk where each test file size is 20G or 100G.
> Architecture:          ppc64
> Byte Order:            Big Endian
> CPU(s):                64
> On-line CPU(s) list:   0-63
> Thread(s) per core:    4
> Core(s) per socket:    1
> Socket(s):             16
> NUMA node(s):          2
> Model:                 IBM,8231-E2C
> Hypervisor vendor:     pHyp
> Virtualization type:   full
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              4096K
> NUMA node0 CPU(s):     0-31
> NUMA node1 CPU(s):     32-63
>
>   Below is the perf testing report:
>
>   Please focus on the two key points:
>   - The overall overhead which is injected by the patchset
>   - The stability of the perf results
>
> 1. fio tests
>
> hot tracking                             w/o          w/          w/          w/          w/          w/          w/
> RAM size                                 32G         32G         16G          8G          4G          2G        250G
>
> sequential-8k-1jobs-read           61260KB/s   60918KB/s   60901KB/s   62610KB/s   60992KB/s   60213KB/s   60948KB/s
> sequential-8k-1jobs-write           1329KB/s    1329KB/s    1328KB/s    1329KB/s    1328KB/s    1329KB/s    1329KB/s
> sequential-8k-8jobs-read           91139KB/s   92614KB/s   90907KB/s   89895KB/s   92022KB/s   90851KB/s   91877KB/s
> sequential-8k-8jobs-write           2523KB/s    2522KB/s    2516KB/s    2521KB/s    2516KB/s    2518KB/s    2521KB/s
> sequential-256k-1jobs-read        151432KB/s  151403KB/s  151406KB/s  151422KB/s  151344KB/s  151446KB/s  151372KB/s
> sequential-256k-1jobs-write        33451KB/s   33470KB/s   33481KB/s   33470KB/s   33459KB/s   33472KB/s   33477KB/s
> sequential-256k-8jobs-read        235291KB/s  234555KB/s  234251KB/s  233656KB/s  234927KB/s  236380KB/s  235535KB/s
> sequential-256k-8jobs-write        62419KB/s   62402KB/s   62191KB/s   62859KB/s   62629KB/s   62720KB/s   62523KB/s
> random-io-mix-8k-1jobs [READ]       2929KB/s    2942KB/s    2946KB/s    2929KB/s    2934KB/s    2947KB/s    2946KB/s
> random-io-mix-8k-1jobs [WRITE]      1262KB/s    1266KB/s    1257KB/s    1262KB/s    1257KB/s    1257KB/s    1265KB/s
> random-io-mix-8k-8jobs [READ]       2444KB/s    2442KB/s    2436KB/s    2416KB/s    2353KB/s    2441KB/s    2442KB/s
> random-io-mix-8k-8jobs [WRITE]      1047KB/s    1044KB/s    1047KB/s    1028KB/s    1017KB/s    1034KB/s    1049KB/s
> random-io-mix-8k-16jobs [READ]      2182KB/s    2184KB/s    2169KB/s    2178KB/s    2190KB/s    2184KB/s    2180KB/s
> random-io-mix-8k-16jobs [WRITE]      932KB/s     930KB/s     943KB/s     936KB/s     937KB/s     929KB/s     931KB/s
>
> The above perf parameter is the aggregate bandwidth of threads in the group;
> If you hope to know how about other perf parameters, or fio raw results, 
> please let me know, thanks.
>
> 2. Locking stat - Contention & Cacheline Bouncing
>
> RAM size class name con-bounces  contentions  acq-bounces   
> acquisitions   cacheline bouncing  locking contention
>   
>ratio  ratio
>
>   &(&root->t_lock)->rlock:  15081592 157834  
> 3

Re: [PATCH v6 07/11] VFS hot tracking: Add a /proc interface to control memory usage

2013-11-12 Thread Zhi Yong Wu
On Wed, Nov 13, 2013 at 5:02 AM, Dave Hansen  wrote:
> On 11/12/2013 12:38 PM, Zhi Yong Wu wrote:
>> On Wed, Nov 13, 2013 at 1:05 AM, Dave Hansen  wrote:
>>> The on/off knob seems to me to be something better left to a mount
>>> option, not a global tunable.
>> If it is left to a mount option, the user or admin can't change it
>> *dynamically*.
>
> Really?
>
> man mount.  Look at "Mount options for tmpfs".  Try this on an existing
> tmpfs mount:
>
> mount -o remount,size=$foo tmpfsmount
>
> How would that be different from your tunable?
Is it lightweight? I thought that a remount would have more overhead and
more effect on the applications running on the filesystem.

>
>>> If this were true, why don't we have similar knobs for the dentry, inode
>>> and page caches?
>> This is not be controlled by memory controller(mem_cgroup)?
>
> That's a good point.  There is a 'kmem' cgroup controller for
> controlling the in-kernel structures (not page cache which is controlled
> by a separate one).  I believe the 'kmem' one would (could?) apply to
> the hot tracking data structures as well, which would obviate the need
> for this tunable.
>
> At least for the dentry and inode caches, they represent kernel-internal
> cache structures and are the same as your hot-data-tracking structures.
>  We don't have explicit /proc controls for the size of the dentry and
> inode caches, so I'm arguing that we should do the same for these new
> hot-data-tracking structures.
If the 'kmem' cgroup controller is applied to VFS hot tracking, do we need
to do some additional coding work in the kernel? If so, we should put it on
the TODO list. We should first push the VFS hot tracking core to get
merged ASAP; an interface like this one can be developed and improved
later.
I don't know what Viro's opinion is. If he also agrees, we can
put it on the TODO list.


>



-- 
Regards,

Zhi Yong Wu


Re: [PATCH v6 07/11] VFS hot tracking: Add a /proc interface to control memory usage

2013-11-12 Thread Zhi Yong Wu
On Wed, Nov 13, 2013 at 1:05 AM, Dave Hansen  wrote:
> On 11/11/2013 02:45 PM, Zhi Yong Wu wrote:
>> On Tue, Nov 12, 2013 at 6:15 AM, Dave Hansen  wrote:
>>> In general, why do you have to control the number of these statically?
>> It gives the user or admin one optional chance to control the amount
>> of memory consumed by VFS hot tracking. And you can choose not to use
>> it.
>
> The on/off knob seems to me to be something better left to a mount
> option, not a global tunable.
If it is left to a mount option, the user or admin can't change it
*dynamically*.

>
>>> Shouldn't you just define a shrinker and let memory pressure determine
>>> how many of these we allow to exist?
>> How about if the user and admin hope to control the amount of the
>> memory consumed by VFS hot tracking? e.g. If the host has several
>> hundred of G or T memory, but the user or admin hope that the memory
>> size consumed by VFS hot tracking is under several G, In the case,
>> maybe a shrinker of VFS hot tracking will never be invoked by system
>> memory module, so this interface will make sense.
>
> If the shrinker is not invoked, that means that there is lots of memory
> free.  In the case that there is lots of memory free, are you arguing
> that a user would rather see memory go *unused* than be put to use for
> this hot tracking data?
First, users and admins have a lot of use cases that you can't imagine.
What if they want to be sure that the memory consumed by VFS hot tracking
doesn't affect other key applications? This only gives them
fine-grained control over the memory consumed by VFS hot
tracking.
>
> If this were true, why don't we have similar knobs for the dentry, inode
> and page caches?
Isn't this controlled by the memory controller (mem_cgroup)?



-- 
Regards,

Zhi Yong Wu


Re: [PATCH v6 07/11] VFS hot tracking: Add a /proc interface to control memory usage

2013-11-11 Thread Zhi Yong Wu
On Tue, Nov 12, 2013 at 6:15 AM, Dave Hansen  wrote:
> On 11/06/2013 05:45 AM, Zhi Yong Wu wrote:
>> Introduce a /proc interface hot-mem-high-thresh and
>> to cap the memory which is consumed by hot_inode_item
>> and hot_range_item, and they will be in the unit of
>> 1M bytes.
>
> You don't seem to have any documentation for this, btw... :(
>
>> + .procname   = "hot-mem-high-thresh",
>
> *Always* put units on these.  I know you mention it in a code comment,
> but please also include it in the proc filename too.
If you think it is better, I will add it.
>
> In general, why do you have to control the number of these statically?
It gives the user or admin one optional chance to control the amount
of memory consumed by VFS hot tracking. And you can choose not to use
it.
> Shouldn't you just define a shrinker and let memory pressure determine
> how many of these we allow to exist?
What if the user or admin wants to control the amount of
memory consumed by VFS hot tracking? E.g. if the host has several
hundred GB or several TB of memory, but the user or admin wants the memory
consumed by VFS hot tracking to stay under a few GB, then
a VFS hot tracking shrinker may never be invoked by the
memory management subsystem, so this interface makes sense.



-- 
Regards,

Zhi Yong Wu


Re: [PATCH v6 00/11] VFS hot tracking

2013-11-11 Thread Zhi Yong Wu
ping? any plan to review?

On Wed, Nov 6, 2013 at 9:45 PM, Zhi Yong Wu  wrote:
> From: Zhi Yong Wu 
>
>   The patchset is trying to introduce hot tracking function in
> VFS layer, which will keep track of real disk I/O in memory.
> By it, you will easily know more details about disk I/O, and
> then detect where disk I/O hot spots are. Also, specific FS
> can take use of it to do accurate defragment, and hot relocation
> support, etc.
>
>   Now it's time to send out its V6 for external review, and
> any comments or ideas are appreciated, thanks.
>
> NOTE:
>
>   The patchset can be obtained via my kernel dev git on github:
> git://github.com/wuzhy/kernel.git hot_tracking
>   If you're interested, you can also review them via
> https://github.com/wuzhy/kernel/commits/hot_tracking
>
>   For how to use and more other info and performance report,
> please check hot_tracking.txt in Documentation and following
> links:
>   1.) http://lwn.net/Articles/525651/
>   2.) https://lkml.org/lkml/2012/12/20/199
>
>   This patchset has been done scalability or performance tests
> by fs_mark, ffsb and compilebench.
>
>   The perf testings were done on Linux 3.12.0-rc7 with Model IBM,8231-E2C
> Big Endian PPC64 with 64 CPUs and 2 NUMA nodes, 250G RAM and 1.50 TiB
> test hard disk where each test file size is 20G or 100G.
> Architecture:          ppc64
> Byte Order:            Big Endian
> CPU(s):                64
> On-line CPU(s) list:   0-63
> Thread(s) per core:    4
> Core(s) per socket:    1
> Socket(s):             16
> NUMA node(s):          2
> Model:                 IBM,8231-E2C
> Hypervisor vendor:     pHyp
> Virtualization type:   full
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              4096K
> NUMA node0 CPU(s):     0-31
> NUMA node1 CPU(s):     32-63
>
>   Below is the perf testing report:
>
>   Please focus on the two key points:
>   - The overall overhead which is injected by the patchset
>   - The stability of the perf results
>
> 1. fio tests
>
> hot tracking                             w/o          w/          w/          w/          w/          w/          w/
> RAM size                                 32G         32G         16G          8G          4G          2G        250G
>
> sequential-8k-1jobs-read           61260KB/s   60918KB/s   60901KB/s   62610KB/s   60992KB/s   60213KB/s   60948KB/s
> sequential-8k-1jobs-write           1329KB/s    1329KB/s    1328KB/s    1329KB/s    1328KB/s    1329KB/s    1329KB/s
> sequential-8k-8jobs-read           91139KB/s   92614KB/s   90907KB/s   89895KB/s   92022KB/s   90851KB/s   91877KB/s
> sequential-8k-8jobs-write           2523KB/s    2522KB/s    2516KB/s    2521KB/s    2516KB/s    2518KB/s    2521KB/s
> sequential-256k-1jobs-read        151432KB/s  151403KB/s  151406KB/s  151422KB/s  151344KB/s  151446KB/s  151372KB/s
> sequential-256k-1jobs-write        33451KB/s   33470KB/s   33481KB/s   33470KB/s   33459KB/s   33472KB/s   33477KB/s
> sequential-256k-8jobs-read        235291KB/s  234555KB/s  234251KB/s  233656KB/s  234927KB/s  236380KB/s  235535KB/s
> sequential-256k-8jobs-write        62419KB/s   62402KB/s   62191KB/s   62859KB/s   62629KB/s   62720KB/s   62523KB/s
> random-io-mix-8k-1jobs [READ]       2929KB/s    2942KB/s    2946KB/s    2929KB/s    2934KB/s    2947KB/s    2946KB/s
> random-io-mix-8k-1jobs [WRITE]      1262KB/s    1266KB/s    1257KB/s    1262KB/s    1257KB/s    1257KB/s    1265KB/s
> random-io-mix-8k-8jobs [READ]       2444KB/s    2442KB/s    2436KB/s    2416KB/s    2353KB/s    2441KB/s    2442KB/s
> random-io-mix-8k-8jobs [WRITE]      1047KB/s    1044KB/s    1047KB/s    1028KB/s    1017KB/s    1034KB/s    1049KB/s
> random-io-mix-8k-16jobs [READ]      2182KB/s    2184KB/s    2169KB/s    2178KB/s    2190KB/s    2184KB/s    2180KB/s
> random-io-mix-8k-16jobs [WRITE]      932KB/s     930KB/s     943KB/s     936KB/s     937KB/s     929KB/s     931KB/s
>
> The above perf parameter is the aggregate bandwidth of threads in the group;
> If you hope to know how about other perf parameters, or fio raw results, 
> please let me know, thanks.
>
> 2. Locking stat - Contention & Cacheline Bouncing
>
> RAM size class name con-bounces  contentions  acq-bounces   
> acquisitions   cacheline bouncing  locking contention
>   
>ratio  ratio
>
>   &(&root->t_lock)->rlock:  15081592 1578

Re: [PATCH] update xfs maintainers

2013-11-08 Thread Zhi Yong Wu
On Sat, Nov 9, 2013 at 6:03 AM, Ben Myers  wrote:
> Hey Ric,
>
> On Fri, Nov 08, 2013 at 03:50:21PM -0500, Ric Wheeler wrote:
>> On 11/08/2013 03:46 PM, Ben Myers wrote:
>> >Hey Christoph,
>> >
>> >On Fri, Nov 08, 2013 at 11:34:24AM -0800, Christoph Hellwig wrote:
>> >>On Fri, Nov 08, 2013 at 12:03:37PM -0600, Ben Myers wrote:
>> >>>Mark is replacing Alex as my backup because Alex is really busy at
>> >>>Linaro and asked to be taken off awhile ago.  The holiday season is
>> >>>coming up and I fully intend to go off my meds, turn in to Fonzy the
>> >>>bear, and eat my hat.  I need someone to watch the shop while I'm off
>> >>>exploring on Mars.  I trust Mark to do that because he is totally
>> >>>awesome.
>> >>
>> >>Doing this as an unilateral decisions is not something that will win you
>> >>a fan base.
>> >It's posted for review.
>> >
>> >>While we never had anything reassembling a democracy in Linux Kernel
>> >>development making decisions without even contacting the major
>> >>contributor is wrong, twice so if the maintainer is a relatively minor
>> >>contributor to start with.
>> >>
>> >>Just because it recent came up elsewhere I'd like to recite the
>> >>definition from Trond here again:
>> >>
>> >>
>> >> http://lists.linux-foundation.org/pipermail/ksummit-2012-discuss/2012-June/66.html
>> >>
>> >>By many of the creative roles enlisted there it's clear that Dave should
>> >>be the maintainer.  He's been the main contributor and chief architect
>> >>for XFS for many year, while the maintainers came and went at the mercy
>> >>of SGI.  This is not meant to bad mouth either of you as I think you're
>> >>doing a reasonably good job compared to other maintainers, but at the
>> >>same time the direction is set by other people that have a much longer
>> >>involvement with the project, and having them officially in control
>> >>would help us forward a lot.  It would also avoid having to spend
>> >>considerable resources to train every new generation of SGI maintainer.
>> >>
>> >>Coming to and end I would like to maintain Dave Chinner as the primary
>> >>XFS maintainer for all the work he has done as biggest contributor and
>> >>architect of XFS since longer than I can remember, and I would love to
>> >>retain Ben Myers as a co-maintainer for all the good work he has done
>> >>maintaining and reviewing patches since November 2011.
>> >I think we're doing a decent job too.  So thanks for that much at least.  ;)
>> >>I would also like to use this post as a public venue to condemn the
>> >>unilateral smokey backroom decisions about XFS maintainership that SGI is
>> >>trying to enforce on the community.
>> >That really didn't happen Christoph.  It's not in my tree or in a pull
>> >request.
>> >
>> >Linus, let me know what you want to do.  I do think we're doing a fair job
>> >over here, and (geez) I'm just trying to add Mark as my backup since Alex
>> >is too busy.  I know the RH people want more control, and that's
>> >understandable, but they really don't need to replace me to get their code
>> >in.  Ouch.
>> >
>> >Thanks,
>> > Ben
>>
>> Christoph is not a Red Hat person.
>>
>> Jeff is from Oracle.
>>
>> This is not a Red Hat vs SGI thing,
>
> Sorry if my read on that was wrong.
>
>> Dave simply has earned the right
>> to take on the formal leadership role of maintainer.
>
> Then we're gonna need some Reviewed-bys.  ;)
>
> From: Ben Myers 
>
> xfs: update maintainers
>
> Add Dave as maintainer of XFS.
>
> Signed-off-by: Ben Myers 
> ---
>  MAINTAINERS |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: b/MAINTAINERS
> ===
> --- a/MAINTAINERS   2013-11-08 15:20:18.935186245 -0600
> +++ b/MAINTAINERS   2013-11-08 15:22:50.685245977 -0600
> @@ -9387,8 +9387,8 @@ F:drivers/xen/*swiotlb*
>
>  XFS FILESYSTEM
>  P: Silicon Graphics Inc
> +M: Dave Chinner 
Is this his personal mail account? I guess you should ask for
his opinion first, or it may be more appropriate for him to submit this
patch himself.

>  M: Ben Myers 
> -M: Alex Elder 
>  M: x...@oss.sgi.com
>  L: x...@oss.sgi.com
>  W: http://oss.sgi.com/projects/xfs
>
> ___
> xfs mailing list
> x...@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs



-- 
Regards,

Zhi Yong Wu


[PATCH 2/3] mm, slub: fix the typo in mm/slub.c

2013-11-08 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 mm/slub.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index c3eb3d3..7a64327 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -155,7 +155,7 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
 /*
  * Maximum number of desirable partial slabs.
  * The existence of more partial slabs makes kmem_cache_shrink
- * sort the partial list by the number of objects in the.
+ * sort the partial list by the number of objects in use.
  */
 #define MAX_PARTIAL 10
 
@@ -2829,8 +2829,8 @@ static struct kmem_cache *kmem_cache_node;
  * slab on the node for this slabcache. There are no concurrent accesses
  * possible.
  *
- * Note that this function only works on the kmalloc_node_cache
- * when allocating for the kmalloc_node_cache. This is used for bootstrapping
+ * Note that this function only works on the kmem_cache_node
+ * when allocating for the kmem_cache_node. This is used for bootstrapping
  * memory on a fresh node that has no slab structures yet.
  */
 static void early_kmem_cache_node_alloc(int node)
-- 
1.7.6.5



[PATCH 3/3] mm, memory-failure: fix the typo in me_pagecache_dirty()

2013-11-08 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 mm/memory-failure.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index bf3351b..d8ec181 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -611,7 +611,7 @@ static int me_pagecache_clean(struct page *p, unsigned long pfn)
 }
 
 /*
- * Dirty cache page page
+ * Dirty cache page
  * Issues: when the error hit a hole page the error is not properly
  * propagated.
  */
-- 
1.7.6.5



[PATCH 1/3] mm, slub: fix the typo in include/linux/slub_def.h

2013-11-08 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Signed-off-by: Zhi Yong Wu 
---
 include/linux/slub_def.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index cc0b67e..f56bfa9 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -11,7 +11,7 @@
 enum stat_item {
ALLOC_FASTPATH, /* Allocation from cpu slab */
ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */
-   FREE_FASTPATH,  /* Free to cpu slub */
+   FREE_FASTPATH,  /* Free to cpu slab */
FREE_SLOWPATH,  /* Freeing not to cpu slab */
FREE_FROZEN,/* Freeing to frozen slab */
FREE_ADD_PARTIAL,   /* Freeing moves slab to partial list */
-- 
1.7.6.5



[PATCH v6 03/11] VFS hot tracking: Add a workqueue to move items between hot maps

2013-11-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Add a workqueue per superblock and a delayed_work
to run periodic work to update map info on each superblock.

Two arrays of map list are defined, one is for hot inode
items, and the other is for hot extent items.

The hot items in the RB-tree are first distilled
into a temperature in the range [0, 255]. Each item is
then linked into the corresponding array of map lists, which uses
the temperature as its index.
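
As a rough userspace illustration of the indexing described above (a
sketch only, not part of the patch): the diff reduces a 32-bit
temperature to a map index with `temp >> (32 - MAP_BITS)`. The value
MAP_BITS = 8 is an assumption here, chosen so the index covers
[0, 255]; `hot_map_index()` is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumption: 8 map bits, i.e. 256 buckets for temperatures 0..255. */
#define MAP_BITS 8

/*
 * Hypothetical helper: keep only the top MAP_BITS bits of the 32-bit
 * temperature, so the result can index an array of 1 << MAP_BITS lists.
 */
static inline uint8_t hot_map_index(uint32_t temp)
{
	return (uint8_t)(temp >> (32 - MAP_BITS));
}
```

With this assumption, every temperature below 0x01000000 lands in
bucket 0, and an item only needs to move between lists when its top
eight bits change.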

Signed-off-by: Chandra Seetharaman 
Signed-off-by: Zhi Yong Wu 
---
 fs/hot_tracking.c| 208 +++
 fs/hot_tracking.h|  25 ++
 include/linux/hot_tracking.h |   8 +-
 3 files changed, 240 insertions(+), 1 deletion(-)

diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
index d68c458..35d3b83 100644
--- a/fs/hot_tracking.c
+++ b/fs/hot_tracking.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "hot_tracking.h"
 
 /* kmem_cache pointers for slab caches */
@@ -22,6 +23,7 @@ static void hot_range_item_init(struct hot_range_item *hr,
struct hot_inode_item *he, loff_t start)
 {
kref_init(&hr->refs);
+   INIT_LIST_HEAD(&hr->track_list);
hr->freq.avg_delta_reads = (u64) -1;
hr->freq.avg_delta_writes = (u64) -1;
hr->start = start;
@@ -41,8 +43,13 @@ static void hot_range_item_free(struct kref *kref)
 {
struct hot_range_item *hr = container_of(kref,
struct hot_range_item, refs);
+   struct hot_info *root = hr->hot_inode->hot_root;
 
rb_erase(&hr->rb_node, &hr->hot_inode->hot_range_tree);
+   spin_lock(&root->m_lock);
+   if (!list_empty(&hr->track_list))
+   list_del_init(&hr->track_list);
+   spin_unlock(&root->m_lock);
 
call_rcu(&hr->rcu, hot_range_item_free_cb);
 }
@@ -67,6 +74,8 @@ static struct hot_range_item
struct rb_node **p;
struct rb_node *parent = NULL;
struct hot_range_item *hr, *hr_new = NULL;
+   u32 temp;
+   u8 temp_cur;
 
start = start << RANGE_BITS;
 
@@ -100,6 +109,12 @@ redo:
if (hr_new) {
rb_link_node(&hr_new->rb_node, parent, p);
rb_insert_color(&hr_new->rb_node, &he->hot_range_tree);
+   temp = hot_temp_calc(&hr_new->freq);
+   temp_cur = (u8)(temp >> (32 - MAP_BITS));
+   spin_lock(&he->hot_root->m_lock);
+   list_add_tail(&hr_new->track_list,
+   &he->hot_root->hot_map[TYPE_RANGE][temp_cur]);
+   spin_unlock(&he->hot_root->m_lock);
hot_range_item_get(hr_new); /* For the caller */
spin_unlock(&he->i_lock);
return hr_new;
@@ -136,10 +151,49 @@ static void hot_range_tree_free(struct hot_inode_item *he)
spin_unlock(&he->i_lock);
 }
 
+static void hot_range_map_update(struct hot_info *root,
+   struct hot_range_item *hr)
+{
+   u32 temp = hot_temp_calc(&hr->freq);
+   u8 temp_cur = (u8)(temp >> (32 - MAP_BITS));
+   u8 temp_prev = (u8)(hr->freq.last_temp >> (32 - MAP_BITS));
+
+   spin_lock(&root->m_lock);
+   if (!list_empty(&hr->track_list)
+   && (temp_cur != temp_prev)) {
+   hr->freq.last_temp = temp;
+   list_del_init(&hr->track_list);
+   list_add_tail(&hr->track_list,
+   &root->hot_map[TYPE_RANGE][temp_cur]);
+   }
+   spin_unlock(&root->m_lock);
+}
+
+/*
+ * Update temperatures for each range item for aging purposes.
+ * If one hot range item is old, it will be aged out.
+ */
+static void hot_range_tree_update(struct hot_inode_item *he,
+   struct hot_info *root)
+{
+   struct rb_node *node;
+   struct hot_range_item *hr;
+
+   rcu_read_lock();
+   node = rb_first(&he->hot_range_tree);
+   while (node) {
+   hr = rb_entry(node, struct hot_range_item, rb_node);
+   node = rb_next(node);
+   hot_range_map_update(root, hr);
+   }
+   rcu_read_unlock();
+}
+
 static void hot_inode_item_init(struct hot_inode_item *he,
struct hot_info *root, u64 ino)
 {
kref_init(&he->refs);
+   INIT_LIST_HEAD(&he->track_list);
he->freq.avg_delta_reads = (u64) -1;
he->freq.avg_delta_writes = (u64) -1;
he->ino = ino;
@@ -161,6 +215,8 @@ static void hot_inode_item_free(struct kref *kref)
struct hot_inode_item, refs);
 
rb_erase(&he->rb_node, &he->hot_root->hot_inode_tree);
+   if (!list_empty(&he->track_list))
+   lis

[PATCH v6 01/11] VFS hot tracking: Define basic data structures and functions

2013-11-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

This patch includes the basic data structure and functions needed for
VFS hot tracking.

It adds hot_inode_tree struct to keep track of frequently accessed
files, and is keyed by {inode, offset}. Trees contain hot_inode_items
representing those files and hot_range_items representing ranges in that
file.
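
To make the range keying a bit more concrete, here is a small
userspace sketch (not the patch code). It assumes RANGE_BITS is 20,
i.e. 1 MiB ranges; the patch only shows `start << RANGE_BITS` and
`len = 1 << RANGE_BITS`, so the value and the helper names below are
hypothetical.

```c
#include <stdint.h>

/* Assumption: each hot_range_item covers 1 << 20 bytes (1 MiB). */
#define RANGE_BITS 20

/* Hypothetical helper: which range a byte offset in the file falls into. */
static inline uint64_t range_index(uint64_t offset)
{
	return offset >> RANGE_BITS;
}

/* Hypothetical helper: byte offset at which a given range starts. */
static inline uint64_t range_start(uint64_t index)
{
	return index << RANGE_BITS;
}

/* Hypothetical helper: length in bytes of every range. */
static inline uint64_t range_len(void)
{
	return (uint64_t)1 << RANGE_BITS;
}
```

Under this assumption, all I/O between offsets 0 and 1048575 is
accounted to the same range item.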

It defines a data structure hot_info, which is associated with a mounted
filesystem, and will be used to store the inode tree and range tree for
hot items pertaining to that filesystem.

Signed-off-by: Chandra Seetharaman 
Signed-off-by: Zhi Yong Wu 
---
 fs/Makefile  |   2 +-
 fs/dcache.c  |   2 +
 fs/hot_tracking.c| 227 +++
 fs/hot_tracking.h|  23 +
 include/linux/fs.h   |   4 +
 include/linux/hot_tracking.h |  66 +
 include/uapi/linux/fs.h  |   1 +
 7 files changed, 324 insertions(+), 1 deletion(-)
 create mode 100644 fs/hot_tracking.c
 create mode 100644 fs/hot_tracking.h
 create mode 100644 include/linux/hot_tracking.h

diff --git a/fs/Makefile b/fs/Makefile
index 4fe6df3..5f9b8f1 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -11,7 +11,7 @@ obj-y :=  open.o read_write.o file_table.o super.o \
attr.o bad_inode.o file.o filesystems.o namespace.o \
seq_file.o xattr.o libfs.o fs-writeback.o \
pnode.o splice.o sync.o utimes.o \
-   stack.o fs_struct.o statfs.o
+   stack.o fs_struct.o statfs.o hot_tracking.o
 
 ifeq ($(CONFIG_BLOCK),y)
 obj-y +=   buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
diff --git a/fs/dcache.c b/fs/dcache.c
index ae6ebb8..40dfd63 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -40,6 +40,7 @@
 #include 
 #include "internal.h"
 #include "mount.h"
+#include "hot_tracking.h"
 
 /*
  * Usage:
@@ -3437,4 +3438,5 @@ void __init vfs_caches_init(unsigned long mempages)
mnt_init();
bdev_cache_init();
chrdev_init();
+   hot_cache_init();
 }
diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
new file mode 100644
index 000..25e7858
--- /dev/null
+++ b/fs/hot_tracking.c
@@ -0,0 +1,227 @@
+/*
+ * fs/hot_tracking.c
+ *
+ * Copyright (C) 2013 IBM Corp. All rights reserved.
+ * Written by Zhi Yong Wu 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include "hot_tracking.h"
+
+/* kmem_cache pointers for slab caches */
+static struct kmem_cache *hot_inode_item_cachep __read_mostly;
+static struct kmem_cache *hot_range_item_cachep __read_mostly;
+
+static void hot_range_item_init(struct hot_range_item *hr,
+   struct hot_inode_item *he, loff_t start)
+{
+   kref_init(&hr->refs);
+   hr->start = start;
+   hr->len = 1 << RANGE_BITS;
+   hr->hot_inode = he;
+}
+
+static void hot_range_item_free_cb(struct rcu_head *head)
+{
+   struct hot_range_item *hr = container_of(head,
+   struct hot_range_item, rcu);
+
+   kmem_cache_free(hot_range_item_cachep, hr);
+}
+
+static void hot_range_item_free(struct kref *kref)
+{
+   struct hot_range_item *hr = container_of(kref,
+   struct hot_range_item, refs);
+
+   rb_erase(&hr->rb_node, &hr->hot_inode->hot_range_tree);
+
+   call_rcu(&hr->rcu, hot_range_item_free_cb);
+}
+
+static void hot_range_item_get(struct hot_range_item *hr)
+{
+kref_get(&hr->refs);
+}
+
+/*
+ * Drops the reference out on hot_range_item by one
+ * and free the structure if the reference count hits zero
+ */
+static void hot_range_item_put(struct hot_range_item *hr)
+{
+kref_put(&hr->refs, hot_range_item_free);
+}
+
+/*
+ * Free the entire hot_range_tree.
+ */
+static void hot_range_tree_free(struct hot_inode_item *he)
+{
+   struct rb_node *node;
+   struct hot_range_item *hr;
+
+   /* Free hot inode and range trees on fs root */
+   spin_lock(&he->i_lock);
+   node = rb_first(&he->hot_range_tree);
+   while (node) {
+   hr = rb_entry(node, struct hot_range_item, rb_node);
+   node = rb_next(node);
+   hot_range_item_put(hr);
+   }
+   spin_unlock(&he->i_lock);
+}
+
+static void hot_inode_item_init(struct hot_inode_item *he,
+   struct hot_info *root, u64 ino)
+{
+   kref_init(&he->refs);
+   he->ino = ino;
+   he->hot_root = root;
+   spin_lock_init(&he->i_lock);
+}
+
+static void hot_inode_item_free_cb(struct rcu_head *head)
+{
+   struct hot_inode_item *he = container_of(head,
+   struct hot_inode_item, rcu);
+
+   kmem_cache_free(

[PATCH v6 09/11] VFS hot tracking, btrfs: Add hot tracking support

2013-11-06 Thread Zhi Yong Wu
From: Zhi Yong Wu 

Introduce one new mount option '-o hot_track',
and add its parsing support.

Its usage looks like:
   mount -o hot_track
   mount -o nouser,hot_track
   mount -o nouser,hot_track,loop
   mount -o hot_track,nouser
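
A minimal userspace sketch of the option matching described above, not the
btrfs code itself: the real parser uses the token table shown in the diff,
whereas this sketch walks the comma-separated string by hand. The names
sketch_parse_options and SKETCH_MOUNT_HOT_TRACK are hypothetical and exist
only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bit; it only mirrors the (1 << 24) used in the patch. */
#define SKETCH_MOUNT_HOT_TRACK (1u << 24)

/*
 * Walk a comma-separated option string and return a flag mask.
 * Only an exact "hot_track" token sets the flag; every other
 * token is ignored in this sketch.
 */
static unsigned int sketch_parse_options(const char *options)
{
	static const char name[] = "hot_track";
	unsigned int flags = 0;
	const char *p = options;

	assert(options != NULL);

	while (*p != '\0') {
		size_t len = strcspn(p, ",");	/* length of the current token */

		if (len == strlen(name) && strncmp(p, name, len) == 0)
			flags |= SKETCH_MOUNT_HOT_TRACK;

		p += len;
		if (*p == ',')
			p++;			/* step over the separator */
	}
	return flags;
}
```

With this sketch, the position of hot_track in the list does not matter, and
a longer token such as "hot_tracking" does not match.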

Reviewed-by: David Sterba 
Signed-off-by: Chandra Seetharaman 
Signed-off-by: Zhi Yong Wu 
---
 fs/btrfs/ctree.h |  1 +
 fs/btrfs/super.c | 22 +-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0506f40..b8d8982 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1990,6 +1990,7 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
 #define BTRFS_MOUNT_PANIC_ON_FATAL_ERROR   (1 << 22)
 #define BTRFS_MOUNT_RESCAN_UUID_TREE   (1 << 23)
+#define BTRFS_MOUNT_HOT_TRACK  (1 << 24)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL  (30)
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index e913328..69fe31d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "compat.h"
 #include "delayed-inode.h"
 #include "ctree.h"
@@ -310,6 +311,10 @@ static void btrfs_put_super(struct super_block *sb)
 * last process that kept it busy.  Or segfault in the aforementioned
 * process...  Whom would you report that to?
 */
+
+   /* Hot data tracking */
+   if (btrfs_test_opt(btrfs_sb(sb)->tree_root, HOT_TRACK))
+   hot_track_exit(sb);
 }
 
 enum {
@@ -323,7 +328,7 @@ enum {
Opt_no_space_cache, Opt_recovery, Opt_skip_balance,
Opt_check_integrity, Opt_check_integrity_including_extent_data,
Opt_check_integrity_print_mask, Opt_fatal_errors, Opt_rescan_uuid_tree,
-   Opt_commit_interval,
+   Opt_commit_interval, Opt_hot_track,
Opt_err,
 };
 
@@ -366,6 +371,7 @@ static match_table_t tokens = {
{Opt_rescan_uuid_tree, "rescan_uuid_tree"},
{Opt_fatal_errors, "fatal_errors=%s"},
{Opt_commit_interval, "commit=%d"},
+   {Opt_hot_track, "hot_track"},
{Opt_err, NULL},
 };
 
@@ -676,6 +682,9 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
info->commit_interval = BTRFS_DEFAULT_COMMIT_INTERVAL;
}
break;
+   case Opt_hot_track:
+   btrfs_set_opt(info->mount_opt, HOT_TRACK);
+   break;
case Opt_err:
printk(KERN_INFO "btrfs: unrecognized mount option "
   "'%s'\n", p);
@@ -898,11 +907,20 @@ static int btrfs_fill_super(struct super_block *sb,
goto fail_close;
}
 
+   if (btrfs_test_opt(fs_info->tree_root, HOT_TRACK)) {
+   err = hot_track_init(sb);
+   if (err)
+   goto fail_hot;
+   }
+
save_mount_options(sb, data);
cleancache_init_fs(sb);
sb->s_flags |= MS_ACTIVE;
return 0;
 
+fail_hot:
+   dput(sb->s_root);
+   sb->s_root = NULL;
 fail_close:
close_ctree(fs_info->tree_root);
return err;
@@ -1014,6 +1032,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
seq_puts(seq, ",fatal_errors=panic");
if (info->commit_interval != BTRFS_DEFAULT_COMMIT_INTERVAL)
seq_printf(seq, ",commit=%d", info->commit_interval);
+   if (btrfs_test_opt(root, HOT_TRACK))
+   seq_puts(seq, ",hot_track");
return 0;
 }
 
-- 
1.7.11.7



[PATCH v6 10/11] VFS hot tracking, xfs: Add hot tracking support

2013-11-06 Thread Zhi Yong Wu
From: Dave Chinner 

Connect up the VFS hot tracking support so that the XFS filesystem
can make use of it.
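
A rough userspace sketch of the hookup pattern this patch follows: tracking
is started at mount time only when the flag is set, a failed start unwinds
what was already set up, and unmount stops tracking under the same flag test.
All sketch_* names and the fail_init parameter are hypothetical; the stub
does not reflect what hot_track_init() actually does or returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_HOTTRACK (1u << 1)	/* hypothetical stand-in for XFS_MOUNT_HOTTRACK */

struct sketch_sb {
	unsigned int flags;
	bool root_held;		/* stands in for sb->s_root being set */
	bool tracking;		/* true while hot tracking is active */
};

/* Stub for the init hook; fail_init lets a caller force an error. */
static int sketch_track_init(struct sketch_sb *sb, bool fail_init)
{
	if (fail_init)
		return -1;
	sb->tracking = true;
	return 0;
}

/* Stub for the exit hook. */
static void sketch_track_exit(struct sketch_sb *sb)
{
	sb->tracking = false;
}

static int sketch_fill_super(struct sketch_sb *sb, bool fail_init)
{
	int error;

	assert(sb != NULL);
	sb->root_held = true;		/* earlier mount steps succeeded */

	if (sb->flags & SKETCH_HOTTRACK) {
		error = sketch_track_init(sb, fail_init);
		if (error)
			goto out_free_root;
	}
	return 0;

out_free_root:
	sb->root_held = false;		/* undo the setup done before the failure */
	return error;
}

static void sketch_put_super(struct sketch_sb *sb)
{
	/* Same flag test as at mount time, so init and exit stay paired. */
	if (sb->flags & SKETCH_HOTTRACK)
		sketch_track_exit(sb);
	sb->root_held = false;
}
```

Testing the same flag on both paths keeps the exit hook from running on a
mount that never started tracking.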

Signed-off-by: Dave Chinner 
Signed-off-by: Zhi Yong Wu 
---
 fs/xfs/xfs_mount.h |  1 +
 fs/xfs/xfs_super.c | 18 ++
 2 files changed, 19 insertions(+)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 1fa0584..c6bbf31 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -184,6 +184,7 @@ typedef struct xfs_mount {
 #define XFS_MOUNT_WSYNC (1ULL << 0) /* for nfs - all metadata ops
   must be synchronous except
   for space allocations */
+#define XFS_MOUNT_HOTTRACK  (1ULL << 1) /* hot tracking */
 #define XFS_MOUNT_WAS_CLEAN (1ULL << 3)
 #define XFS_MOUNT_FS_SHUTDOWN  (1ULL << 4) /* atomic stop of all filesystem
   operations, typically for
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 15188cc..a2667f9 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -62,6 +62,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const struct super_operations xfs_super_operations;
 static kmem_zone_t *xfs_ioend_zone;
@@ -115,6 +116,7 @@ mempool_t *xfs_ioend_pool;
 #define MNTOPT_NODELAYLOG  "nodelaylog" /* Delayed logging disabled */
 #define MNTOPT_DISCARD "discard" /* Discard unused blocks */
 #define MNTOPT_NODISCARD   "nodiscard" /* Do not discard unused blocks */
+#define MNTOPT_HOTTRACK "hot_track"  /* hot tracking */
 
 /*
  * Table driven mount option parser.
@@ -381,6 +383,8 @@ xfs_parseargs(
mp->m_flags |= XFS_MOUNT_DISCARD;
} else if (!strcmp(this_char, MNTOPT_NODISCARD)) {
mp->m_flags &= ~XFS_MOUNT_DISCARD;
+   } else if (!strcmp(this_char, MNTOPT_HOTTRACK)) {
+   mp->m_flags |= XFS_MOUNT_HOTTRACK;
} else if (!strcmp(this_char, "ihashsize")) {
xfs_warn(mp,
"ihashsize no longer used, option is deprecated.");
@@ -504,6 +508,7 @@ xfs_showargs(
{ XFS_MOUNT_GRPID,  "," MNTOPT_GRPID },
{ XFS_MOUNT_DISCARD,  "," MNTOPT_DISCARD },
{ XFS_MOUNT_SMALL_INUMS,  "," MNTOPT_32BITINODE },
+   { XFS_MOUNT_HOTTRACK,   "," MNTOPT_HOTTRACK },
{ 0, NULL }
};
static struct proc_xfs_info xfs_info_unset[] = {
@@ -1046,6 +1051,9 @@ xfs_fs_put_super(
 {
struct xfs_mount *mp = XFS_M(sb);
 
+   if (mp->m_flags & XFS_MOUNT_HOTTRACK)
+   hot_track_exit(sb);
+
xfs_filestream_unmount(mp);
xfs_unmountfs(mp);
 
@@ -1501,8 +1509,18 @@ xfs_fs_fill_super(
goto out_unmount;
}
 
+   if (mp->m_flags & XFS_MOUNT_HOTTRACK) {
+   error = hot_track_init(sb);
+   if (error)
+   goto out_free_root;
+   }
+
return 0;
 
+ out_free_root:
+   dput(sb->s_root);
+   sb->s_root = NULL;
+
  out_filestream_unmount:
xfs_filestream_unmount(mp);
  out_free_sb:
-- 
1.7.11.7
