Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Keith Busch
On Wed, Aug 03, 2022 at 09:46:05AM +0800, Jinhao Fan wrote:
> at 4:54 PM, Klaus Jensen  wrote:
> 
> > I am unsure if the compiler will transform that division into the shift
> > if it can infer that the divisor is a power of two (it most likely
> > will be able to).
> > 
> > But I see no reason to have a potential division here when we can do
> > without, and to me it is just as readable when you know that the doorbell
> > stride defined by DSTRD is `2 ^ (2 + DSTRD)`.
> 
> OK. I will send a new patch with shifts instead of divisions. BTW, why do we
> want to avoid divisions?

Integer division takes at least an order of magnitude more CPU cycles than a
shift. Some archs are worse than others, but historically we go out of our way
to avoid divisions in a hot path, so shifting is the more familiar coding
pattern.

Compilers typically implement division as a shift if you're dividing by a
power-of-two integer constant expression (ICE).

This example here isn't an ICE, but it is a shifted constant power-of-two. I
wrote up a simple test to see what my compiler does with that, and it looks
like gcc will properly optimize it, but only if compiled with '-O3'.
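
For reference, a minimal standalone sketch of that kind of test might look
like the following (the function names are made up for illustration; this is
not the exact program that was compiled):

#include <stdint.h>

/* Division by a runtime power of two (4 << dstrd), mirroring the stride
 * arithmetic used in the patch. */
uint64_t qid_by_division(uint64_t offset, unsigned dstrd)
{
    uint64_t stride = 4ULL << dstrd;
    return offset / (2 * stride);
}

/* The equivalent shift form suggested in review:
 * 2 * (4 << dstrd) == 1 << (2 + dstrd + 1). */
uint64_t qid_by_shift(uint64_t offset, unsigned dstrd)
{
    return offset >> (2 + dstrd + 1);
}

Comparing the generated assembly (e.g. 'gcc -S -O3' vs. '-O1') shows whether
the division variant is strength-reduced to the same shift or still emits a
div instruction.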



Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Jinhao Fan
at 4:54 PM, Klaus Jensen  wrote:

> I am unsure if the compiler will transform that division into the shift
> if it can infer that the divisor is a power of two (it most likely
> will be able to).
> 
> But I see no reason to have a potential division here when we can do
> without, and to me it is just as readable when you know that the doorbell
> stride defined by DSTRD is `2 ^ (2 + DSTRD)`.

OK. I will send a new patch with shifts instead of divisions. BTW, why do we
want to avoid divisions?




Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Klaus Jensen
On Aug  2 10:54, Klaus Jensen wrote:
> On Aug  2 16:31, Jinhao Fan wrote:
> > at 2:02 PM, Klaus Jensen  wrote:
> > 
> > > On Jul 28 16:07, Jinhao Fan wrote:
> > >> With the introduction of shadow doorbell and ioeventfd, we need to do
> > >> frequent conversion between qid and its doorbell offset. The original
> > >> hard-coded calculation is confusing and error-prone. Add several helper
> > >> functions to do this task.
> > >> 
> > >> Signed-off-by: Jinhao Fan 
> > >> ---
> > >> hw/nvme/ctrl.c | 61 --
> > >> 1 file changed, 39 insertions(+), 22 deletions(-)
> > >> 
> > >> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > >> index 533ad14e7a..6116c0e660 100644
> > >> --- a/hw/nvme/ctrl.c
> > >> +++ b/hw/nvme/ctrl.c
> > >> @@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
> > >> {
> > >> return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
> > >> }
> > >> +static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
> > >> +{
> > >> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> > >> +return (offset / stride) & 1;
> > >> +}
> > > 
> > > This can be changed morphed into `(offset >> (2 + dstrd)) & 1` if I am not
> > > mistaken.
> > > 
> > 
> > Yes. But my current code looks more readable to me. Is it necessary to
> > change to `(offset >> (2 + dstrd)) & 1`?
> > 
> 
> I am unsure if the compiler will transform that division into the shift
> if it can infer that the divisor is a power of two (it most likely
> will be able to).
> 
> But I see no reason to have a potential division here when we can do
> without, and to me it is just as readable when you know that the doorbell
> stride defined by DSTRD is `2 ^ (2 + DSTRD)`.
> 
> > >> +
> > >> +static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
> > >> +{
> > >> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> > >> +return offset / (2 * stride);
> > >> +}
> > > 
> > > Same, should be able to do `offset >> (2 * dstrd + 1)`, no?
> > 
> > Same as above.
> > 
> 

I meant `offset >> (2 + dstrd + 1)` ('+', not '*') like above of course.
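
For illustration, the shift-based forms of the two helpers might look roughly
like this (a sketch only, not part of the posted patch, with DSTRD read out of
CAP into a local variable):

static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
{
    uint32_t dstrd = NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));

    /* bit (2 + dstrd) distinguishes CQ doorbells from SQ doorbells */
    return (offset >> (2 + dstrd)) & 1;
}

static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
{
    uint32_t dstrd = NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));

    /* each qid owns one SQ and one CQ doorbell, i.e. 2 * (4 << dstrd) bytes */
    return offset >> (2 + dstrd + 1);
}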




Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Jinhao Fan
at 2:02 PM, Klaus Jensen  wrote:

> On Jul 28 16:07, Jinhao Fan wrote:
>> With the introduction of shadow doorbell and ioeventfd, we need to do
>> frequent conversion between qid and its doorbell offset. The original
>> hard-coded calculation is confusing and error-prone. Add several helper
>> functions to do this task.
>> 
>> Signed-off-by: Jinhao Fan 
>> ---
>> hw/nvme/ctrl.c | 61 --
>> 1 file changed, 39 insertions(+), 22 deletions(-)
>> 
>> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
>> index 533ad14e7a..6116c0e660 100644
>> --- a/hw/nvme/ctrl.c
>> +++ b/hw/nvme/ctrl.c
>> @@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
>> {
>> return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
>> }
>> +static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
>> +{
>> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
>> +return (offset / stride) & 1;
>> +}
> 
> This can be changed morphed into `(offset >> (2 + dstrd)) & 1` if I am not
> mistaken.
> 

Yes. But my current code looks more readable to me. Is it necessary to
change to `(offset >> (2 + dstrd)) & 1`?

>> +
>> +static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
>> +{
>> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
>> +return offset / (2 * stride);
>> +}
> 
> Same, should be able to do `offset >> (2 * dstrd + 1)`, no?

Same as above.




Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Klaus Jensen
On Aug  2 16:31, Jinhao Fan wrote:
> at 2:02 PM, Klaus Jensen  wrote:
> 
> > On Jul 28 16:07, Jinhao Fan wrote:
> >> With the introduction of shadow doorbell and ioeventfd, we need to do
> >> frequent conversion between qid and its doorbell offset. The original
> >> hard-coded calculation is confusing and error-prone. Add several helper
> >> functions to do this task.
> >> 
> >> Signed-off-by: Jinhao Fan 
> >> ---
> >> hw/nvme/ctrl.c | 61 --
> >> 1 file changed, 39 insertions(+), 22 deletions(-)
> >> 
> >> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> >> index 533ad14e7a..6116c0e660 100644
> >> --- a/hw/nvme/ctrl.c
> >> +++ b/hw/nvme/ctrl.c
> >> @@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
> >> {
> >> return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
> >> }
> >> +static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
> >> +{
> >> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> >> +return (offset / stride) & 1;
> >> +}
> > 
> > This can be changed morphed into `(offset >> (2 + dstrd)) & 1` if I am not
> > mistaken.
> > 
> 
> Yes. But my current code looks more readable to me. Is it necessary to
> change to `(offset >> (2 + dstrd)) & 1`?
> 

I am unsure if the compiler will transform that division into the shift
if it can infer that the divisor is a power of two (it most likely
will be able to).

But I see no reason to have a potential division here when we can do
without, and to me it is just as readable when you know that the doorbell
stride defined by DSTRD is `2 ^ (2 + DSTRD)`.

> >> +
> >> +static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
> >> +{
> >> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> >> +return offset / (2 * stride);
> >> +}
> > 
> > Same, should be able to do `offset >> (2 * dstrd + 1)`, no?
> 
> Same as above.
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.




Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-02 Thread Klaus Jensen
On Jul 28 16:07, Jinhao Fan wrote:
> With the introduction of shadow doorbell and ioeventfd, we need to do
> frequent conversion between qid and its doorbell offset. The original
> hard-coded calculation is confusing and error-prone. Add several helper
> functions to do this task.
> 
> Signed-off-by: Jinhao Fan 
> ---
>  hw/nvme/ctrl.c | 61 --
>  1 file changed, 39 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 533ad14e7a..6116c0e660 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
>  {
>  return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
>  }
> +static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return (offset / stride) & 1;
> +}

This can be changed morphed into `(offset >> (2 + dstrd)) & 1` if I am not
mistaken.


> +
> +static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return offset / (2 * stride);
> +}

Same, should be able to do `offset >> (2 * dstrd + 1)`, no?

> +
> +static inline hwaddr nvme_cqid_to_db_offset(NvmeCtrl *n, uint16_t cqid)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return stride * (cqid * 2 + 1);
> +}
> +
> +static inline hwaddr nvme_sqid_to_db_offset(NvmeCtrl *n, uint16_t sqid)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return stride * sqid * 2;
> +}
>  
>  static void nvme_inc_cq_tail(NvmeCQueue *cq)
>  {
> @@ -4256,7 +4279,7 @@ static void nvme_cq_notifier(EventNotifier *e)
>  static int nvme_init_cq_ioeventfd(NvmeCQueue *cq)
>  {
>  NvmeCtrl *n = cq->ctrl;
> -uint16_t offset = (cq->cqid << 3) + (1 << 2);
> +uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
>  int ret;
>  
> ret = event_notifier_init(&cq->notifier, 0);
> @@ -4283,7 +4306,7 @@ static void nvme_sq_notifier(EventNotifier *e)
>  static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
>  {
>  NvmeCtrl *n = sq->ctrl;
> -uint16_t offset = sq->sqid << 3;
> +uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
>  int ret;
>  
> ret = event_notifier_init(&sq->notifier, 0);
> @@ -4300,7 +4323,7 @@ static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
>  
>  static void nvme_free_sq(NvmeSQueue *sq, NvmeCtrl *n)
>  {
> -uint16_t offset = sq->sqid << 3;
> +uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
>  
>  n->sq[sq->sqid] = NULL;
>  timer_free(sq->timer);
> @@ -4379,8 +4402,8 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
>  sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);
>  
>  if (n->dbbuf_enabled) {
> -sq->db_addr = n->dbbuf_dbs + (sqid << 3);
> -sq->ei_addr = n->dbbuf_eis + (sqid << 3);
> +sq->db_addr = n->dbbuf_dbs + nvme_sqid_to_db_offset(n, sqid);
> +sq->ei_addr = n->dbbuf_eis + nvme_sqid_to_db_offset(n, sqid);
>  
>  if (n->params.ioeventfd && sq->sqid != 0) {
>  if (!nvme_init_sq_ioeventfd(sq)) {
> @@ -4690,8 +4713,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
>  
>  static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
>  {
> -uint16_t offset = (cq->cqid << 3) + (1 << 2);
> -
> +uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
> +
>  n->cq[cq->cqid] = NULL;
>  timer_free(cq->timer);
>  if (cq->ioeventfd_enabled) {
> @@ -4755,8 +4778,8 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
> QTAILQ_INIT(&cq->req_list);
> QTAILQ_INIT(&cq->sq_list);
>  if (n->dbbuf_enabled) {
> -cq->db_addr = n->dbbuf_dbs + (cqid << 3) + (1 << 2);
> -cq->ei_addr = n->dbbuf_eis + (cqid << 3) + (1 << 2);
> +cq->db_addr = n->dbbuf_dbs + nvme_cqid_to_db_offset(n, cqid);
> +cq->ei_addr = n->dbbuf_eis + nvme_cqid_to_db_offset(n, cqid);
>  
>  if (n->params.ioeventfd && cqid != 0) {
>  if (!nvme_init_cq_ioeventfd(cq)) {
> @@ -6128,13 +6151,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
>  NvmeCQueue *cq = n->cq[i];
>  
>  if (sq) {
> -/*
> - * CAP.DSTRD is 0, so offset of ith sq db_addr is (i<<3)
> - * nvme_process_db() uses this hard-coded way to calculate
> - * doorbell offsets. Be consistent with that here.
> - */
> -sq->db_addr = dbs_addr + (i << 3);
> -sq->ei_addr = eis_addr + (i << 3);
> +sq->db_addr = dbs_addr + nvme_sqid_to_db_offset(n, i);
> +sq->ei_addr = eis_addr + nvme_sqid_to_db_offset(n, i);
> pci_dma_write(&n->parent_obj, sq->db_addr, &sq->tail,
>  sizeof(sq->tail));
>  
> @@ -6146,9 +6164,8 @@ static uint16_t 

Re: [PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-08-01 Thread Jinhao Fan
at 4:07 PM, Jinhao Fan  wrote:

> With the introduction of shadow doorbell and ioeventfd, we need to do
> frequent conversion between qid and its doorbell offset. The original
> hard-coded calculation is confusing and error-prone. Add several helper
> functions to do this task.
> 
> Signed-off-by: Jinhao Fan 
> ---
> hw/nvme/ctrl.c | 61 --
> 1 file changed, 39 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 533ad14e7a..6116c0e660 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
> {
> return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
> }
> +static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return (offset / stride) & 1;
> +}
> +
> +static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return offset / (2 * stride);
> +}
> +
> +static inline hwaddr nvme_cqid_to_db_offset(NvmeCtrl *n, uint16_t cqid)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return stride * (cqid * 2 + 1);
> +}
> +
> +static inline hwaddr nvme_sqid_to_db_offset(NvmeCtrl *n, uint16_t sqid)
> +{
> +hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
> +return stride * sqid * 2;
> +}
> 
> static void nvme_inc_cq_tail(NvmeCQueue *cq)
> {
> @@ -4256,7 +4279,7 @@ static void nvme_cq_notifier(EventNotifier *e)
> static int nvme_init_cq_ioeventfd(NvmeCQueue *cq)
> {
> NvmeCtrl *n = cq->ctrl;
> -uint16_t offset = (cq->cqid << 3) + (1 << 2);
> +uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
> int ret;
> 
> ret = event_notifier_init(&cq->notifier, 0);
> @@ -4283,7 +4306,7 @@ static void nvme_sq_notifier(EventNotifier *e)
> static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
> {
> NvmeCtrl *n = sq->ctrl;
> -uint16_t offset = sq->sqid << 3;
> +uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
> int ret;
> 
> ret = event_notifier_init(&sq->notifier, 0);
> @@ -4300,7 +4323,7 @@ static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
> 
> static void nvme_free_sq(NvmeSQueue *sq, NvmeCtrl *n)
> {
> -uint16_t offset = sq->sqid << 3;
> +uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
> 
> n->sq[sq->sqid] = NULL;
> timer_free(sq->timer);
> @@ -4379,8 +4402,8 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
> sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);
> 
> if (n->dbbuf_enabled) {
> -sq->db_addr = n->dbbuf_dbs + (sqid << 3);
> -sq->ei_addr = n->dbbuf_eis + (sqid << 3);
> +sq->db_addr = n->dbbuf_dbs + nvme_sqid_to_db_offset(n, sqid);
> +sq->ei_addr = n->dbbuf_eis + nvme_sqid_to_db_offset(n, sqid);
> 
> if (n->params.ioeventfd && sq->sqid != 0) {
> if (!nvme_init_sq_ioeventfd(sq)) {
> @@ -4690,8 +4713,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
> 
> static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
> {
> -uint16_t offset = (cq->cqid << 3) + (1 << 2);
> -
> +uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
> +
> n->cq[cq->cqid] = NULL;
> timer_free(cq->timer);
> if (cq->ioeventfd_enabled) {
> @@ -4755,8 +4778,8 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
> QTAILQ_INIT(&cq->req_list);
> QTAILQ_INIT(&cq->sq_list);
> if (n->dbbuf_enabled) {
> -cq->db_addr = n->dbbuf_dbs + (cqid << 3) + (1 << 2);
> -cq->ei_addr = n->dbbuf_eis + (cqid << 3) + (1 << 2);
> +cq->db_addr = n->dbbuf_dbs + nvme_cqid_to_db_offset(n, cqid);
> +cq->ei_addr = n->dbbuf_eis + nvme_cqid_to_db_offset(n, cqid);
> 
> if (n->params.ioeventfd && cqid != 0) {
> if (!nvme_init_cq_ioeventfd(cq)) {
> @@ -6128,13 +6151,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
> NvmeCQueue *cq = n->cq[i];
> 
> if (sq) {
> -/*
> - * CAP.DSTRD is 0, so offset of ith sq db_addr is (i<<3)
> - * nvme_process_db() uses this hard-coded way to calculate
> - * doorbell offsets. Be consistent with that here.
> - */
> -sq->db_addr = dbs_addr + (i << 3);
> -sq->ei_addr = eis_addr + (i << 3);
> +sq->db_addr = dbs_addr + nvme_sqid_to_db_offset(n, i);
> +sq->ei_addr = eis_addr + nvme_sqid_to_db_offset(n, i);
> pci_dma_write(&n->parent_obj, sq->db_addr, &sq->tail,
> sizeof(sq->tail));
> 
> @@ -6146,9 +6164,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
> }
> 
> if (cq) {
> -/* CAP.DSTRD is 0, so offset of ith cq db_addr is (i<<3)+(1<<2) */
> -

[PATCH] hw/nvme: Add helper functions for qid-db conversion

2022-07-28 Thread Jinhao Fan
With the introduction of shadow doorbell and ioeventfd, we need to do
frequent conversion between qid and its doorbell offset. The original
hard-coded calculation is confusing and error-prone. Add several helper
functions to do this task.

Signed-off-by: Jinhao Fan 
---
 hw/nvme/ctrl.c | 61 --
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 533ad14e7a..6116c0e660 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -487,6 +487,29 @@ static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
 {
 return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
 }
+static inline bool nvme_db_offset_is_cq(NvmeCtrl *n, hwaddr offset)
+{
+hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
+return (offset / stride) & 1;
+}
+
+static inline uint16_t nvme_db_offset_to_qid(NvmeCtrl *n, hwaddr offset)
+{
+hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
+return offset / (2 * stride);
+}
+
+static inline hwaddr nvme_cqid_to_db_offset(NvmeCtrl *n, uint16_t cqid)
+{
+hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
+return stride * (cqid * 2 + 1);
+}
+
+static inline hwaddr nvme_sqid_to_db_offset(NvmeCtrl *n, uint16_t sqid)
+{
+hwaddr stride = 4 << NVME_CAP_DSTRD(ldq_le_p(&n->bar.cap));
+return stride * sqid * 2;
+}
 
 static void nvme_inc_cq_tail(NvmeCQueue *cq)
 {
@@ -4256,7 +4279,7 @@ static void nvme_cq_notifier(EventNotifier *e)
 static int nvme_init_cq_ioeventfd(NvmeCQueue *cq)
 {
 NvmeCtrl *n = cq->ctrl;
-uint16_t offset = (cq->cqid << 3) + (1 << 2);
+uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
 int ret;
 
 ret = event_notifier_init(&cq->notifier, 0);
@@ -4283,7 +4306,7 @@ static void nvme_sq_notifier(EventNotifier *e)
 static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
 {
 NvmeCtrl *n = sq->ctrl;
-uint16_t offset = sq->sqid << 3;
+uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
 int ret;
 
 ret = event_notifier_init(&sq->notifier, 0);
@@ -4300,7 +4323,7 @@ static int nvme_init_sq_ioeventfd(NvmeSQueue *sq)
 
 static void nvme_free_sq(NvmeSQueue *sq, NvmeCtrl *n)
 {
-uint16_t offset = sq->sqid << 3;
+uint16_t offset = nvme_sqid_to_db_offset(n, sq->sqid);
 
 n->sq[sq->sqid] = NULL;
 timer_free(sq->timer);
@@ -4379,8 +4402,8 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
 sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);
 
 if (n->dbbuf_enabled) {
-sq->db_addr = n->dbbuf_dbs + (sqid << 3);
-sq->ei_addr = n->dbbuf_eis + (sqid << 3);
+sq->db_addr = n->dbbuf_dbs + nvme_sqid_to_db_offset(n, sqid);
+sq->ei_addr = n->dbbuf_eis + nvme_sqid_to_db_offset(n, sqid);
 
 if (n->params.ioeventfd && sq->sqid != 0) {
 if (!nvme_init_sq_ioeventfd(sq)) {
@@ -4690,8 +4713,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 
 static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 {
-uint16_t offset = (cq->cqid << 3) + (1 << 2);
-
+uint16_t offset = nvme_cqid_to_db_offset(n, cq->cqid);
+
 n->cq[cq->cqid] = NULL;
 timer_free(cq->timer);
 if (cq->ioeventfd_enabled) {
@@ -4755,8 +4778,8 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
 QTAILQ_INIT(&cq->req_list);
 QTAILQ_INIT(&cq->sq_list);
 if (n->dbbuf_enabled) {
-cq->db_addr = n->dbbuf_dbs + (cqid << 3) + (1 << 2);
-cq->ei_addr = n->dbbuf_eis + (cqid << 3) + (1 << 2);
+cq->db_addr = n->dbbuf_dbs + nvme_cqid_to_db_offset(n, cqid);
+cq->ei_addr = n->dbbuf_eis + nvme_cqid_to_db_offset(n, cqid);
 
 if (n->params.ioeventfd && cqid != 0) {
 if (!nvme_init_cq_ioeventfd(cq)) {
@@ -6128,13 +6151,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
 NvmeCQueue *cq = n->cq[i];
 
 if (sq) {
-/*
- * CAP.DSTRD is 0, so offset of ith sq db_addr is (i<<3)
- * nvme_process_db() uses this hard-coded way to calculate
- * doorbell offsets. Be consistent with that here.
- */
-sq->db_addr = dbs_addr + (i << 3);
-sq->ei_addr = eis_addr + (i << 3);
+sq->db_addr = dbs_addr + nvme_sqid_to_db_offset(n, i);
+sq->ei_addr = eis_addr + nvme_sqid_to_db_offset(n, i);
 pci_dma_write(&n->parent_obj, sq->db_addr, &sq->tail,
 sizeof(sq->tail));
 
@@ -6146,9 +6164,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
 }
 
 if (cq) {
-/* CAP.DSTRD is 0, so offset of ith cq db_addr is (i<<3)+(1<<2) */
-cq->db_addr = dbs_addr + (i << 3) + (1 << 2);
-cq->ei_addr = eis_addr + (i << 3) + (1 << 2);
+cq->db_addr = dbs_addr + nvme_cqid_to_db_offset(n, i);
+cq->ei_addr = eis_addr + nvme_cqid_to_db_offset(n, i);
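
As a quick standalone sanity check of the mapping (not part of the thread),
with CAP.DSTRD = 0 the stride is 4 bytes and the helper arithmetic reduces to
the old hard-coded expressions:

#include <assert.h>
#include <stdint.h>

int main(void)
{
    const uint64_t stride = 4;  /* 4 << 0, i.e. CAP.DSTRD == 0 */

    for (uint16_t i = 0; i < 16; i++) {
        /* arithmetic from nvme_sqid_to_db_offset()/nvme_cqid_to_db_offset() */
        uint64_t sq_off = stride * i * 2;
        uint64_t cq_off = stride * (i * 2 + 1);

        /* old hard-coded forms replaced by the patch */
        assert(sq_off == (uint64_t)(i << 3));
        assert(cq_off == (uint64_t)((i << 3) + (1 << 2)));
    }

    return 0;
}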