[PATCH v6] mmc: block: Add write packing control

2013-04-29 Thread Maya Erez
The write packing control ensures that read request latency is
not increased by long packed write commands.

The trigger for enabling write packing is derived from the relation
between the current number of potential packed write requests and the
mean of all previous potential values:
If the current potential is greater than the mean potential, the
heuristic assumes that the following workload will contain many write
requests, so the packed trigger is lowered. In the opposite case the
trigger is raised in order to get fewer packing events.
The trigger for disabling write packing is fetching a read request.
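
As an illustration only, the heuristic above can be sketched in standalone C.
The helper names, the adjustment step of 1 and the bounds passed in are
assumptions made for this sketch and are not taken from the patch. The sketch
compares the new sample with the unscaled mean, which is an assumption about
the intent; the patch performs that comparison after multiplying the mean by
the element count. The precision multiplier is left out because multiplying
and then dividing by the same constant around a single integer division gives
the same quotient.

```c
/* Hypothetical helper: fold one new sample into an integer running mean. */
static unsigned long update_mean(unsigned long mean, unsigned int count,
                                 unsigned long potential)
{
    /* mean[i+1] = (mean[i] * count + potential) / (count + 1) */
    unsigned long sum = mean * count + potential;

    /*
     * Bias the sum upward when the sample is above the mean, so that
     * integer truncation does not pull the mean down too much.
     */
    if (potential > mean)
        sum += count;

    return sum / (count + 1);
}

/*
 * Hypothetical helper: move the trigger one step (assumed step size)
 * toward more packing when the sample is above the mean, and toward
 * less packing when it is below, clamped to [lower, upper].
 */
static unsigned int adjust_trigger(unsigned int trigger, unsigned long mean,
                                   unsigned long potential,
                                   unsigned int lower, unsigned int upper)
{
    if (potential > mean && trigger > lower)
        trigger--;
    else if (potential < mean && trigger < upper)
        trigger++;

    return trigger;
}
```

With a mean of 17 after one element, a sample of 30 moves the mean to 24 and
lowers a trigger of 10 to 9; a following sample of 8 moves the mean to 18.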

Signed-off-by: Maya Erez <me...@codeaurora.org>
Signed-off-by: Lee Susman <lsus...@codeaurora.org>
---
Our experiments showed that write packing can increase the worst-case read
latency.
Since read latency is critical for user experience, we added a write packing
control mechanism that disables write packing when read requests arrive.
This ensures that read request latency is not increased by long packed write
commands.

The trigger for enabling write packing is managing to pack several write
requests.
The number of potential packed requests that will trigger the packing can be
configured via sysfs.
The trigger for disabling write packing is a fetch of a read request.
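
The enable/disable behaviour described above can be sketched as a small state
update in standalone C. This is only a sketch: the struct, the enum and the
function name are hypothetical stand-ins, and the use of ">=" for reaching the
trigger is an assumption, not something taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the request direction (rq_data_dir() in the kernel). */
enum req_dir { REQ_READ, REQ_WRITE };

/* Hypothetical per-queue state for the write packing control. */
struct wr_pack_ctl {
    bool packing_enabled;           /* is write packing currently allowed? */
    unsigned int potential_wr_reqs; /* consecutive writes seen so far */
    unsigned int trigger;           /* writes needed to enable packing */
};

/* Update the control state for one fetched request. */
static void wr_pack_update(struct wr_pack_ctl *ctl, enum req_dir dir)
{
    if (dir == REQ_READ) {
        /* A fetched read disables packing and restarts the count. */
        ctl->packing_enabled = false;
        ctl->potential_wr_reqs = 0;
        return;
    }

    /* Enough consecutive writes enable packing (assumed ">=" comparison). */
    ctl->potential_wr_reqs++;
    if (ctl->potential_wr_reqs >= ctl->trigger)
        ctl->packing_enabled = true;
}
```

With a trigger of 3, two writes leave packing disabled, the third enables it,
and the next read disables it again and resets the count.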

Changes in v6:
- Dynamic calculation of the trigger for enabling the write packing (instead
  of a hardcoded value)

Changes in v5:
- Revert v4 changes
- Fix the device attribute removal in case of failure of device_create_file

Changes in v4:
- Move MMC specific attributes to mmc sub-directory

Changes in v3:
- Fix the settings of num_of_potential_packed_wr_reqs

Changes in v2:
- Move the attribute for setting the packing enabling trigger to the block
  device
- Add documentation of the new attribute
---
 drivers/mmc/card/block.c |  131 ++
 drivers/mmc/card/queue.c |    8 +++
 drivers/mmc/card/queue.h |    3 +
 include/linux/mmc/host.h |    1 +
 4 files changed, 143 insertions(+), 0 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index e12a03c..e0ed0b4 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -64,6 +64,13 @@ MODULE_ALIAS("mmc:block");
  (rq_data_dir(req) == WRITE))
 #define PACKED_CMD_VER 0x01
 #define PACKED_CMD_WR  0x02
+#define PACKED_TRIGGER_MAX_ELEMENTS	5000
+#define PCKD_TRGR_INIT_MEAN_POTEN	17
+#define PCKD_TRGR_POTEN_LOWER_BOUND	5
+#define PCKD_TRGR_URGENT_PENALTY	2
+#define PCKD_TRGR_LOWER_BOUND		5
+#define PCKD_TRGR_PRECISION_MULTIPLIER	100
+
 
 static DEFINE_MUTEX(block_mutex);
 
@@ -1405,6 +1412,122 @@ static inline u8 mmc_calc_packed_hdr_segs(struct request_queue *q,
 	return nr_segs;
 }
 
+static int get_packed_trigger(int potential, struct mmc_card *card,
+			      struct request *req, int curr_trigger)
+{
+	static int num_mean_elements = 1;
+	static unsigned long mean_potential = PCKD_TRGR_INIT_MEAN_POTEN;
+	unsigned int trigger = curr_trigger;
+	unsigned int pckd_trgr_upper_bound = card->ext_csd.max_packed_writes;
+
+	/* scale down the upper bound to 75% */
+	pckd_trgr_upper_bound = (pckd_trgr_upper_bound * 3) / 4;
+
+	/*
+	 * since the most common calls for this function are with small
+	 * potential write values and since we don't want these calls to affect
+	 * the packed trigger, set a lower bound and ignore calls with
+	 * potential lower than that bound
+	 */
+	if (potential <= PCKD_TRGR_POTEN_LOWER_BOUND)
+		return trigger;
+
+	/*
+	 * this is to prevent integer overflow in the following calculation:
+	 * once every PACKED_TRIGGER_MAX_ELEMENTS reset the algorithm
+	 */
+	if (num_mean_elements > PACKED_TRIGGER_MAX_ELEMENTS) {
+		num_mean_elements = 1;
+		mean_potential = PCKD_TRGR_INIT_MEAN_POTEN;
+	}
+
+	/*
+	 * get next mean value based on previous mean value and current
+	 * potential packed writes. Calculation is as follows:
+	 * mean_pot[i+1] =
+	 *	((mean_pot[i] * num_mean_elem) + potential)/(num_mean_elem + 1)
+	 */
+	mean_potential *= num_mean_elements;
+	/*
+	 * add num_mean_elements so that the division of two integers doesn't
+	 * lower mean_potential too much
+	 */
+	if (potential > mean_potential)
+		mean_potential += num_mean_elements;
+	mean_potential += potential;
+	/* this is for gaining more precision when dividing two integers */
+	mean_potential *= PCKD_TRGR_PRECISION_MULTIPLIER;
+	/* this completes the mean calculation */
+	mean_potential /= ++num_mean_elements;
+	mean_potential /= PCKD_TRGR_PRECISION_MULTIPLIER;
+
+	/*
+	 * if current potential packed writes is greater than the
