Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread David Miller
From: Chris Metcalf 
Date: Thu, 14 Nov 2013 12:09:21 -0500

> In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
> bug for ia64 by converting a 64-bit write to use put_unaligned().
> Unfortunately, since gcc will convert a short memset() to a series
> of appropriately-aligned stores, the problem is now visible again
> on tilegx, where the memset that zeros out proc_event is converted
> to three 64-bit stores, causing an unaligned access panic.
> 
> A better fix for the original problem is to ensure that proc_event
> is aligned to 8 bytes here.  We can do that relatively easily by
> arranging to start the struct cn_msg aligned to 8 bytes and then
> offset by 4 bytes.  Doing so means that the immediately following
> proc_event structure is then correctly aligned to 8 bytes.
> 
> The result is that the memset() stores are now aligned, and as an
> added benefit, we can remove the put_unaligned() calls in the code.
> 
> Signed-off-by: Chris Metcalf 

This looks fine to me, applied and queued up for -stable, thanks
Chris.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
On 11/14/2013 2:45 PM, Pete Zaitcev wrote:
> On Thu, 14 Nov 2013 12:09:21 -0500
> Chris Metcalf  wrote:
>
>> -__u8 buffer[CN_PROC_MSG_SIZE];
>> +__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
>> -msg = (struct cn_msg *)buffer;
>> +msg = buffer_to_cn_msg(buffer);
>>  ev = (struct proc_event *)msg->data;
>>  memset(>event_data, 0, sizeof(ev->event_data));
> Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

That would be fine from a correctness point of view; I'm happy
either way.  My patch nominally has better performance, for
what that's worth, since the memset() call is for a smaller
range (24 bytes instead of 60).  It also avoids the need for
put_unaligned(), which even on platforms that allow unaligned
stores can still be slower.

I can certainly do a v2 with the larger memset() instead if
that's the consensus.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Pete Zaitcev
On Thu, 14 Nov 2013 12:09:21 -0500
Chris Metcalf  wrote:

> - __u8 buffer[CN_PROC_MSG_SIZE];
> + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);

> - msg = (struct cn_msg *)buffer;
> + msg = buffer_to_cn_msg(buffer);
>   ev = (struct proc_event *)msg->data;
>   memset(>event_data, 0, sizeof(ev->event_data));

Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.

A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here.  We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes.  Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.

The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.

Signed-off-by: Chris Metcalf 
---
 drivers/connector/cn_proc.c | 72 ++---
 1 file changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index c73fc2b74de2..18c5b9b16645 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -32,11 +32,23 @@
 #include 
 #include 
 
-#include 
-
 #include 
 
-#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event))
+/*
+ * Size of a cn_msg followed by a proc_event structure.  Since the
+ * sizeof struct cn_msg is a multiple of 4 bytes, but not 8 bytes, we
+ * add one 4-byte word to the size here, and then start the actual
+ * cn_msg structure 4 bytes into the stack buffer.  The result is that
+ * the immediately following proc_event structure is aligned to 8 bytes.
+ */
+#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event) + 
4)
+
+/* See comment above; we test our assumption about sizeof struct cn_msg here. 
*/
+static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer)
+{
+   BUILD_BUG_ON(sizeof(struct cn_msg) != 20);
+   return (struct cn_msg *)(buffer + 4);
+}
 
 static atomic_t proc_event_num_listeners = ATOMIC_INIT(0);
 static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC };
@@ -56,19 +68,19 @@ void proc_fork_connector(struct task_struct *task)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
struct task_struct *parent;
 
if (atomic_read(_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(>event_data, 0, sizeof(ev->event_data));
get_seq(>seq, >cpu);
ktime_get_ts(); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns();
ev->what = PROC_EVENT_FORK;
rcu_read_lock();
parent = rcu_dereference(task->real_parent);
@@ -91,17 +103,17 @@ void proc_exec_connector(struct task_struct *task)
struct cn_msg *msg;
struct proc_event *ev;
struct timespec ts;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 
if (atomic_read(_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(>event_data, 0, sizeof(ev->event_data));
get_seq(>seq, >cpu);
ktime_get_ts(); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns();
ev->what = PROC_EVENT_EXEC;
ev->event_data.exec.process_pid = task->pid;
ev->event_data.exec.process_tgid = task->tgid;
@@ -117,14 +129,14 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
const struct cred *cred;
 
if (atomic_read(_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(>event_data, 0, sizeof(ev->event_data));
ev->what = which_id;
@@ -145,7 +157,7 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
rcu_read_unlock();
get_seq(>seq, >cpu);
ktime_get_ts(); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns();
 
memcpy(>id, _proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
@@ -159,17 +171,17 @@ void proc_sid_connector(struct task_struct *task)
struct cn_msg *msg;
struct 

[PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.

A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here.  We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes.  Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.

The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.

Signed-off-by: Chris Metcalf cmetc...@tilera.com
---
 drivers/connector/cn_proc.c | 72 ++---
 1 file changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index c73fc2b74de2..18c5b9b16645 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -32,11 +32,23 @@
 #include linux/atomic.h
 #include linux/pid_namespace.h
 
-#include asm/unaligned.h
-
 #include linux/cn_proc.h
 
-#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event))
+/*
+ * Size of a cn_msg followed by a proc_event structure.  Since the
+ * sizeof struct cn_msg is a multiple of 4 bytes, but not 8 bytes, we
+ * add one 4-byte word to the size here, and then start the actual
+ * cn_msg structure 4 bytes into the stack buffer.  The result is that
+ * the immediately following proc_event structure is aligned to 8 bytes.
+ */
+#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event) + 
4)
+
+/* See comment above; we test our assumption about sizeof struct cn_msg here. 
*/
+static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer)
+{
+   BUILD_BUG_ON(sizeof(struct cn_msg) != 20);
+   return (struct cn_msg *)(buffer + 4);
+}
 
 static atomic_t proc_event_num_listeners = ATOMIC_INIT(0);
 static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC };
@@ -56,19 +68,19 @@ void proc_fork_connector(struct task_struct *task)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
struct task_struct *parent;
 
if (atomic_read(proc_event_num_listeners)  1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg-data;
memset(ev-event_data, 0, sizeof(ev-event_data));
get_seq(msg-seq, ev-cpu);
ktime_get_ts(ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns);
+   ev-timestamp_ns = timespec_to_ns(ts);
ev-what = PROC_EVENT_FORK;
rcu_read_lock();
parent = rcu_dereference(task-real_parent);
@@ -91,17 +103,17 @@ void proc_exec_connector(struct task_struct *task)
struct cn_msg *msg;
struct proc_event *ev;
struct timespec ts;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 
if (atomic_read(proc_event_num_listeners)  1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg-data;
memset(ev-event_data, 0, sizeof(ev-event_data));
get_seq(msg-seq, ev-cpu);
ktime_get_ts(ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns);
+   ev-timestamp_ns = timespec_to_ns(ts);
ev-what = PROC_EVENT_EXEC;
ev-event_data.exec.process_pid = task-pid;
ev-event_data.exec.process_tgid = task-tgid;
@@ -117,14 +129,14 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
const struct cred *cred;
 
if (atomic_read(proc_event_num_listeners)  1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg-data;
memset(ev-event_data, 0, sizeof(ev-event_data));
ev-what = which_id;
@@ -145,7 +157,7 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
rcu_read_unlock();
get_seq(msg-seq, ev-cpu);
ktime_get_ts(ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns);
+   ev-timestamp_ns = timespec_to_ns(ts);
 
memcpy(msg-id, cn_proc_event_id, sizeof(msg-id));
msg-ack = 0; /* not 

Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Pete Zaitcev
On Thu, 14 Nov 2013 12:09:21 -0500
Chris Metcalf cmetc...@tilera.com wrote:

 - __u8 buffer[CN_PROC_MSG_SIZE];
 + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);

 - msg = (struct cn_msg *)buffer;
 + msg = buffer_to_cn_msg(buffer);
   ev = (struct proc_event *)msg-data;
   memset(ev-event_data, 0, sizeof(ev-event_data));

Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

-- Pete
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
On 11/14/2013 2:45 PM, Pete Zaitcev wrote:
 On Thu, 14 Nov 2013 12:09:21 -0500
 Chris Metcalf cmetc...@tilera.com wrote:

 -__u8 buffer[CN_PROC_MSG_SIZE];
 +__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 -msg = (struct cn_msg *)buffer;
 +msg = buffer_to_cn_msg(buffer);
  ev = (struct proc_event *)msg-data;
  memset(ev-event_data, 0, sizeof(ev-event_data));
 Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

That would be fine from a correctness point of view; I'm happy
either way.  My patch nominally has better performance, for
what that's worth, since the memset() call is for a smaller
range (24 bytes instead of 60).  It also avoids the need for
put_unaligned(), which even on platforms that allow unaligned
stores can still be slower.

I can certainly do a v2 with the larger memset() instead if
that's the consensus.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread David Miller
From: Chris Metcalf cmetc...@tilera.com
Date: Thu, 14 Nov 2013 12:09:21 -0500

 In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
 bug for ia64 by converting a 64-bit write to use put_unaligned().
 Unfortunately, since gcc will convert a short memset() to a series
 of appropriately-aligned stores, the problem is now visible again
 on tilegx, where the memset that zeros out proc_event is converted
 to three 64-bit stores, causing an unaligned access panic.
 
 A better fix for the original problem is to ensure that proc_event
 is aligned to 8 bytes here.  We can do that relatively easily by
 arranging to start the struct cn_msg aligned to 8 bytes and then
 offset by 4 bytes.  Doing so means that the immediately following
 proc_event structure is then correctly aligned to 8 bytes.
 
 The result is that the memset() stores are now aligned, and as an
 added benefit, we can remove the put_unaligned() calls in the code.
 
 Signed-off-by: Chris Metcalf cmetc...@tilera.com

This looks fine to me, applied and queued up for -stable, thanks
Chris.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/