Re: [PATCH] Time-based RFC 4122 UUID generator

2007-11-20 Thread Helge Deller
On Tuesday 20 November 2007, Andrew Morton wrote:
 On Sun, 18 Nov 2007 20:38:21 +0100 Helge Deller [EMAIL PROTECTED] wrote:
 
  Andrew,
  
  could you please consider adding this patch to your 2.6.25 patch series?
 
 please cc netdev on networking-related things

Ok.
 
  This is the third version of the patch in which I cleaned up and fixed 
  quite some stuff according to feedback from Ted.
  I assume this version is OK, since I didn't received any further feedback 
  since two weeks: http://lkml.org/lkml/2007/11/4/128.
  
  Thanks,
  Helge
  ---
  Title: Add time-based RFC 4122 UUID generator
  
  The current Linux kernel currently contains the generate_random_uuid() 
  function, which creates - based on RFC 4122 - truly random UUIDs and 
  provides them to userspace through /proc/sys/kernel/random/boot_id and 
  /proc/sys/kernel/random/uuid.
  
  This patch additionally adds the Time-based UUID variant of RFC 4122, 
  with which userspace applications can easily get real unique time-based 
  UUIDs through /proc/sys/kernel/random/uuid_time.
  A new /proc/sys/kernel/random/uuid_time_clockseq sysfs entry is available,
  so that the clock_seq value can be retained across system bootups (which
  is required by RFC 4122).
  
  The attached implementation uses getnstimeofday() to get very fine-grained
  granularity. This helps, so that userspace tools can get a lot more UUIDs 
  (if needed) per time than before.
  A mutex takes care of the proper locking against a mistaken double creation 
  of UUIDs for simultanious running processes.
  
  Signed-off-by: Helge Deller [EMAIL PROTECTED]
  
   drivers/char/random.c  |  205 
  -
   include/linux/sysctl.h |5 -
   2 files changed, 190 insertions(+), 20 deletions(-)
  
  diff --git a/drivers/char/random.c b/drivers/char/random.c
  index 5fee056..fc48c29 100644
  --- a/drivers/char/random.c
  +++ b/drivers/char/random.c
  @@ -6,6 +6,9 @@
* Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999.  All
* rights reserved.
*
  + * Time based UUID (RFC 4122) generator:
  + * Copyright Helge Deller [EMAIL PROTECTED], 2007
  + *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
  @@ -239,6 +242,7 @@
   #include linux/spinlock.h
   #include linux/percpu.h
   #include linux/cryptohash.h
  +#include linux/if_arp.h
   
   #include asm/processor.h
   #include asm/uaccess.h
  @@ -1174,12 +1178,169 @@ EXPORT_SYMBOL(generate_random_uuid);
   static int min_read_thresh = 8, min_write_thresh;
   static int max_read_thresh = INPUT_POOL_WORDS * 32;
   static int max_write_thresh = INPUT_POOL_WORDS * 32;
  -static char sysctl_bootid[16];
  +static unsigned char sysctl_bootid[16] __read_mostly;
   
   /*
  - * These functions is used to return both the bootid UUID, and random
  - * UUID.  The difference is in whether table-data is NULL; if it is,
  - * then a new UUID is generated and returned to the user.
  + * Helper functions and variables for time based UUID generator
  + */
  +static unsigned int clock_seq;
  +static const unsigned int clock_seq_max = 0x3fff; /* 14 bits */
 
 There isn't a lot of point in `static const'.  Hopefully the compiler will
 do the right thing with it (use literal constant and elide the storage if
 nothing takes its address) but why not just do #define UPPER_CASE_THING in
 the time-homoured manner?

Thanks. I'll fix this with a #define.
 
  +static int clock_seq_initialized __read_mostly;
  +
  +static void init_clockseq(void)
  +{
  +   get_random_bytes(clock_seq, sizeof(clock_seq));
  +   clock_seq = clock_seq_max;
  +   clock_seq_initialized = 1;
  +}
  +
  +static int proc_dointvec_clockseq(struct ctl_table *table, int write,
  +   struct file *filp, void __user *buffer,
  +   size_t *lenp, loff_t *ppos)
  +{
  +   int ret;
  +
  +   if (!write  !clock_seq_initialized)
  +   init_clockseq();
 
 Seems there's a straightfroward race here where multiple tasks can run
 init_clockseq() concurrently.
 
 Can't we use a regular initcall here and make init_clockseq() __init?  I
 guess that would cast doubt over the quality of the thing which
 get_random_bytes() returned, but that's already the case - super-early
 userspace in initramfs could mount /proc and trigger this call.  We end up
 with a predictable sequence number?

Ok, I'll change that to an initcall. 
It's possible for userspace to write a new random new clock_seq value to
/proc/sys/kernel/random/uuid_time_clockseq at any time at bootup, although
this shouldn't be necessary at all if the clock is correct at the time the first
UUID is asked for.
So, in principle it shouldn't be a big problem if it's an initcall.

  +   ret = proc_dointvec(table, write, filp, buffer, lenp, ppos);
  +
  +   if (write  ret = 0) {
  +   clock_seq_initialized = 1;
  +   clock_seq = clock_seq_max;
  +   }
 

Re: [PATCH] Time-based RFC 4122 UUID generator

2007-11-19 Thread Andrew Morton
On Sun, 18 Nov 2007 20:38:21 +0100 Helge Deller [EMAIL PROTECTED] wrote:

 Andrew,
 
 could you please consider adding this patch to your 2.6.25 patch series?

please cc netdev on networking-related things

 This is the third version of the patch in which I cleaned up and fixed quite 
 some stuff according to feedback from Ted.
 I assume this version is OK, since I didn't received any further feedback 
 since two weeks: http://lkml.org/lkml/2007/11/4/128.
 
 Thanks,
 Helge
 ---
 Title: Add time-based RFC 4122 UUID generator
 
 The current Linux kernel currently contains the generate_random_uuid() 
 function, which creates - based on RFC 4122 - truly random UUIDs and 
 provides them to userspace through /proc/sys/kernel/random/boot_id and 
 /proc/sys/kernel/random/uuid.
 
 This patch additionally adds the Time-based UUID variant of RFC 4122, 
 with which userspace applications can easily get real unique time-based 
 UUIDs through /proc/sys/kernel/random/uuid_time.
 A new /proc/sys/kernel/random/uuid_time_clockseq sysfs entry is available,
 so that the clock_seq value can be retained across system bootups (which
 is required by RFC 4122).
 
 The attached implementation uses getnstimeofday() to get very fine-grained
 granularity. This helps, so that userspace tools can get a lot more UUIDs 
 (if needed) per time than before.
 A mutex takes care of the proper locking against a mistaken double creation 
 of UUIDs for simultanious running processes.
 
 Signed-off-by: Helge Deller [EMAIL PROTECTED]
 
  drivers/char/random.c  |  205 
 -
  include/linux/sysctl.h |5 -
  2 files changed, 190 insertions(+), 20 deletions(-)
 
 diff --git a/drivers/char/random.c b/drivers/char/random.c
 index 5fee056..fc48c29 100644
 --- a/drivers/char/random.c
 +++ b/drivers/char/random.c
 @@ -6,6 +6,9 @@
   * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999.  All
   * rights reserved.
   *
 + * Time based UUID (RFC 4122) generator:
 + * Copyright Helge Deller [EMAIL PROTECTED], 2007
 + *
   * Redistribution and use in source and binary forms, with or without
   * modification, are permitted provided that the following conditions
   * are met:
 @@ -239,6 +242,7 @@
  #include linux/spinlock.h
  #include linux/percpu.h
  #include linux/cryptohash.h
 +#include linux/if_arp.h
  
  #include asm/processor.h
  #include asm/uaccess.h
 @@ -1174,12 +1178,169 @@ EXPORT_SYMBOL(generate_random_uuid);
  static int min_read_thresh = 8, min_write_thresh;
  static int max_read_thresh = INPUT_POOL_WORDS * 32;
  static int max_write_thresh = INPUT_POOL_WORDS * 32;
 -static char sysctl_bootid[16];
 +static unsigned char sysctl_bootid[16] __read_mostly;
  
  /*
 - * These functions is used to return both the bootid UUID, and random
 - * UUID.  The difference is in whether table-data is NULL; if it is,
 - * then a new UUID is generated and returned to the user.
 + * Helper functions and variables for time based UUID generator
 + */
 +static unsigned int clock_seq;
 +static const unsigned int clock_seq_max = 0x3fff; /* 14 bits */

There isn't a lot of point in `static const'.  Hopefully the compiler will
do the right thing with it (use literal constant and elide the storage if
nothing takes its address) but why not just do #define UPPER_CASE_THING in
the time-homoured manner?

 +static int clock_seq_initialized __read_mostly;
 +
 +static void init_clockseq(void)
 +{
 + get_random_bytes(clock_seq, sizeof(clock_seq));
 + clock_seq = clock_seq_max;
 + clock_seq_initialized = 1;
 +}
 +
 +static int proc_dointvec_clockseq(struct ctl_table *table, int write,
 + struct file *filp, void __user *buffer,
 + size_t *lenp, loff_t *ppos)
 +{
 + int ret;
 +
 + if (!write  !clock_seq_initialized)
 + init_clockseq();

Seems there's a straightfroward race here where multiple tasks can run
init_clockseq() concurrently.

Can't we use a regular initcall here and make init_clockseq() __init?  I
guess that would cast doubt over the quality of the thing which
get_random_bytes() returned, but that's already the case - super-early
userspace in initramfs could mount /proc and trigger this call.  We end up
with a predictable sequence number?

 + ret = proc_dointvec(table, write, filp, buffer, lenp, ppos);
 +
 + if (write  ret = 0) {
 + clock_seq_initialized = 1;
 + clock_seq = clock_seq_max;
 + }
 +
 + return ret;
 +}
 +
 +/*
 + * Generate time based UUID (RFC 4122)
 + *
 + * This function is protected with a mutex to ensure system-wide
 + * uniqiness of the new time based UUID.
 + */
 +static void generate_random_uuid_time(unsigned char uuid_out[16])
 +{
 + static DEFINE_MUTEX(uuid_mutex);
 + static u64 last_time_all;
 + static unsigned int clock_seq_started;
 + static unsigned char last_mac[ETH_ALEN];
 +
 + struct timespec ts;
 + u64 time_all;
 + unsigned char *found_mac = NULL;
 + struct