Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
On 10 August 2016 at 15:41, David Milburn wrote:
> Hi,
>
> The 168 makes AHCI_CMD_TBL_SZ equal to 2816
>
> AHCI_CMD_TBL_SZ = AHCI_CMD_TBL_HDR_SZ + (AHCI_MAX_SG * 16)
> AHCI_CMD_TBL_SZ = 128 + (168 * 16)
>
> I think if you add in AHCI_CMD_SLOT_SZ (1024) and AHCI_RX_FIS_SZ (256)
> the DMA is 4K aligned, I think that is where the 168 came from.

Looks like the right guess. Though AHCI_PORT_PRIV_DMA_SZ is not:

AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_SZ (2816) + AHCI_RX_FIS_SZ (256) = 4096

but:

AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_AR_SZ (2816 * 32 = 90112) + AHCI_RX_FIS_SZ (256) = 91392

and AHCI_PORT_PRIV_FBS_DMA_SZ is:

AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_AR_SZ (2816 * 32 = 90112) + AHCI_RX_FIS_SZ * 16 (4096) = 95232

>
> Thanks,
> David
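The sums in this message can be checked mechanically. What follows is a minimal C sketch, not driver code: the constant names mirror the macros named in the thread, `AHCI_MAX_CMDS` is an assumed name for the factor of 32 used above, and `bytes_past_4k` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values as quoted in the thread; this is a stand-alone sketch, not the driver header. */
enum {
	AHCI_MAX_SG               = 168,
	AHCI_MAX_CMDS             = 32,   /* assumed name for the "* 32" factor */
	AHCI_CMD_SLOT_SZ          = 1024,
	AHCI_RX_FIS_SZ            = 256,
	AHCI_CMD_TBL_HDR_SZ       = 128,
	AHCI_CMD_TBL_SZ           = AHCI_CMD_TBL_HDR_SZ + AHCI_MAX_SG * 16,
	AHCI_CMD_TBL_AR_SZ        = AHCI_CMD_TBL_SZ * AHCI_MAX_CMDS,
	AHCI_PORT_PRIV_DMA_SZ     = AHCI_CMD_SLOT_SZ + AHCI_CMD_TBL_AR_SZ + AHCI_RX_FIS_SZ,
	AHCI_PORT_PRIV_FBS_DMA_SZ = AHCI_CMD_SLOT_SZ + AHCI_CMD_TBL_AR_SZ + AHCI_RX_FIS_SZ * 16
};

/* Compile-time checks of the figures given in the message. */
static_assert(AHCI_CMD_TBL_SZ == 2816, "128 + 168 * 16");
static_assert(AHCI_CMD_TBL_AR_SZ == 90112, "2816 * 32");
static_assert(AHCI_PORT_PRIV_DMA_SZ == 91392, "non-FBS total");
static_assert(AHCI_PORT_PRIV_FBS_DMA_SZ == 95232, "FBS total");

/* Hypothetical helper: bytes past the last 4 KiB boundary (0 means 4K aligned). */
static inline size_t bytes_past_4k(size_t size)
{
	return size % 4096;
}
```

With these values, `bytes_past_4k()` gives 0 for the single-table sum (1024 + 2816 + 256), 1280 for `AHCI_PORT_PRIV_DMA_SZ`, and 1024 for `AHCI_PORT_PRIV_FBS_DMA_SZ`, so only the single-table sum is a multiple of 4096.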
Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
On 08/10/2016 12:19 PM, Tom Yan wrote:
> On 10 August 2016 at 15:41, David Milburn wrote:
> > Hi,
> >
> > The 168 makes AHCI_CMD_TBL_SZ equal to 2816
> >
> > AHCI_CMD_TBL_SZ = AHCI_CMD_TBL_HDR_SZ + (AHCI_MAX_SG * 16)
> > AHCI_CMD_TBL_SZ = 128 + (168 * 16)
> >
> > I think if you add in AHCI_CMD_SLOT_SZ (1024) and AHCI_RX_FIS_SZ (256)
> > the DMA is 4K aligned, I think that is where the 168 came from.
>
> Looks like the right guess. Though AHCI_PORT_PRIV_DMA_SZ is not:
>
> AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_SZ (2816) + AHCI_RX_FIS_SZ (256) = 4096
>
> but:
>
> AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_AR_SZ (2816 * 32 = 90112) + AHCI_RX_FIS_SZ (256) = 91392
>
> and AHCI_PORT_PRIV_FBS_DMA_SZ is:
>
> AHCI_CMD_SLOT_SZ (1024) + AHCI_CMD_TBL_AR_SZ (2816 * 32 = 90112) + AHCI_RX_FIS_SZ * 16 (4096) = 95232

Yes, but in both cases mem_dma gets adjusted for AHCI_CMD_SLOT_SZ (1024)
and rx_fis_sz (256 or 4096 in fbs case).

Thanks,
David
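To make the "mem_dma gets adjusted" remark concrete, here is a small C sketch of where each command table would start inside the per-port DMA block. It assumes the block is laid out as command list first, then the received-FIS area, then the command tables back to back; that ordering is an assumption consistent with the message, not something the thread spells out. `cmd_tbl_offset` is a hypothetical helper, not a driver function.

```c
#include <stddef.h>

/* Sizes as quoted in the thread. */
enum {
	AHCI_CMD_SLOT_SZ = 1024,
	AHCI_RX_FIS_SZ   = 256,
	AHCI_CMD_TBL_SZ  = 2816   /* 128 + 168 * 16 */
};

/*
 * Hypothetical helper: byte offset of command table `slot` from the start
 * of the port's DMA block, under the assumed layout
 *   [command list][received-FIS area][table 0][table 1]...
 * `fbs` selects the 16x larger received-FIS area mentioned for the FBS case.
 */
static inline size_t cmd_tbl_offset(int fbs, unsigned slot)
{
	size_t rx_fis_sz = fbs ? (size_t)AHCI_RX_FIS_SZ * 16 : (size_t)AHCI_RX_FIS_SZ;

	return AHCI_CMD_SLOT_SZ + rx_fis_sz + (size_t)slot * AHCI_CMD_TBL_SZ;
}
```

Under that assumed layout, table 0 starts at offset 1280 (non-FBS) or 5120 (FBS), and in the non-FBS case table 1 starts at exactly 4096. Every table offset is a multiple of 128, which, as far as I recall, is the alignment the AHCI specification asks of a command table base address.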
Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
On 10 August 2016 at 11:26, Tejun Heo wrote:
> Hmmm.. why not? The hardware limit is 64k and the driver is using a

Is that referring to the maximum number of entries allowed in the
PRDT, Physical Region Descriptor Table (which is, more precisely,
65535)?

> lower limit of 168 most likely because it doesn't make noticeable
> difference beyond certain point and it determines the size of
> contiguous memory which has to be allocated for the command table.
> Each sg entry is 16 bytes. Pushing it to the hardware limit would
> require an order 9 allocation for each port.

That makes sense to me, and I didn't have the intention to push it to
the limit anyway.

> Not necessarily. A single sg entry can point to an area larger than
> PAGE_SIZE.

You mean the 4MB limit of "Data Byte Count" in "DW3: Description
Information" of the PRDT? Is that what max_segment_size (which is set
to a general fallback of 65536:
http://lxr.free-electrons.com/ident?i=dma_get_max_seg_size) is about
in this case?

And my point was, it will be a multiple of 168 anyway, if 1344 is just
an example.

> As written above, that probably makes the ahci command table size
> nicely aligned.

I think that's what bothers me ultimately, cause I don't see how 168
makes it (more) nicely aligned (or even, aligned to what?).

I even checked out the AHCI driver of FreeBSD
(https://github.com/freebsd/freebsd/blob/master/sys/dev/ahci/ahci.h):

...
#define MAXPHYS 512 * 1024
...
#define AHCI_SG_ENTRIES (roundup(btoc(MAXPHYS) + 1, 8))
...
#define AHCI_CT_SIZE (128 + AHCI_SG_ENTRIES * 16)
...

Couldn't get the sense out of the `+ 1` and round up to 8 thing either.
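For reference, the FreeBSD macros quoted here can be expanded by hand. The C sketch below does so under two assumptions that are not in the thread: a 4 KiB page size, and local stand-ins `ROUNDUP` and `BTOC` for FreeBSD's `roundup()` and `btoc()` (round up to a multiple, and bytes to pages rounded up).

```c
#include <assert.h>

/* Assumption: 4 KiB pages. The real value is platform dependent. */
#define PAGE_SIZE_ASSUMED 4096

/* Local stand-ins for FreeBSD's roundup() and btoc(); assumed semantics. */
#define ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
#define BTOC(bytes)   (((bytes) + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED)

/* The three definitions quoted from ahci.h, parenthesised for safety. */
#define MAXPHYS         (512 * 1024)
#define AHCI_SG_ENTRIES (ROUNDUP(BTOC(MAXPHYS) + 1, 8))
#define AHCI_CT_SIZE    (128 + AHCI_SG_ENTRIES * 16)

static_assert(BTOC(MAXPHYS) == 128, "512 KiB spans 128 pages when page aligned");
static_assert(AHCI_SG_ENTRIES == 136, "129 rounded up to a multiple of 8");
static_assert(AHCI_CT_SIZE == 2304, "128 + 136 * 16");
```

One plausible reading of the `+ 1`, offered as a guess rather than the FreeBSD authors' stated intent: a 512 KiB buffer that does not start on a page boundary touches 129 pages rather than 128, so one extra entry is needed if each entry maps at most one page.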
Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
Hello, Tom.

On Wed, Aug 10, 2016 at 06:04:10PM +0800, Tom Yan wrote:
> On 10 August 2016 at 11:26, Tejun Heo wrote:
> > Hmmm.. why not? The hardware limit is 64k and the driver is using a
>
> Is that referring to the maximum number of entries allowed in the
> PRDT, Physical Region Descriptor Table (which is, more precisely,
> 65535)?

Yeap.

> > Not necessarily. A single sg entry can point to an area larger than
> > PAGE_SIZE.
>
> You mean the 4MB limit of "Data Byte Count" in "DW3: Description
> Information" of the PRDT? Is that what max_segment_size (which is set
> to a general fallback of 65536:
> http://lxr.free-electrons.com/ident?i=dma_get_max_seg_size) is about
> in this case?

Ah, ahci isn't setting the hardware limit properly but yeah that's the
maximum segment size.

> And my point was, it will be a multiple of 168 anyway, if 1344 is just
> an example.
>
> > As written above, that probably makes the ahci command table size
> > nicely aligned.
>
> I think that's what bothers me ultimately, cause I don't see how 168
> makes it (more) nicely aligned (or even, aligned to what?).

Hmmm... Looked at the sizes and they don't seem to align to anything
meaningful. No idea.

Thanks.

-- 
tejun
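The 4MB figure discussed here follows from the width of the "Data Byte Count" field. As a rough C sketch, assuming the field is 22 bits wide, stores the length minus one, and requires an even byte count (my reading of the AHCI PRD entry layout, so treat it as an assumption), the encoding looks like this. The macro names and `prd_dbc` are hypothetical, introduced only for illustration.

```c
#include <stdint.h>

/* Assumed field layout: 22-bit, zero-based byte count in DW3 of a PRD entry. */
#define PRD_DBC_BITS  22
#define PRD_DBC_MASK  ((UINT32_C(1) << PRD_DBC_BITS) - 1)
#define PRD_MAX_BYTES (UINT32_C(1) << PRD_DBC_BITS)   /* 4 MiB */

/*
 * Hypothetical helper: encode a segment length in bytes as a DBC value.
 * Returns -1 if the length cannot be represented (zero, odd, or over 4 MiB).
 */
static inline int64_t prd_dbc(uint32_t len)
{
	if (len == 0 || (len & 1) || len > PRD_MAX_BYTES)
		return -1;

	return (int64_t)((len - 1) & PRD_DBC_MASK);
}
```

Under these assumptions, the 65536-byte fallback for max_segment_size encodes as 0xFFFF, far below the 0x3FFFFF that a full 4 MiB segment would use.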
Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
Hi,

On 08/10/2016 10:14 AM, Tejun Heo wrote:
> Hello, Tom.
>
> On Wed, Aug 10, 2016 at 06:04:10PM +0800, Tom Yan wrote:
> > On 10 August 2016 at 11:26, Tejun Heo wrote:
> > > Hmmm.. why not? The hardware limit is 64k and the driver is using a
> >
> > Is that referring to the maximum number of entries allowed in the
> > PRDT, Physical Region Descriptor Table (which is, more precisely,
> > 65535)?
>
> Yeap.
>
> > > Not necessarily. A single sg entry can point to an area larger than
> > > PAGE_SIZE.
> >
> > You mean the 4MB limit of "Data Byte Count" in "DW3: Description
> > Information" of the PRDT? Is that what max_segment_size (which is set
> > to a general fallback of 65536:
> > http://lxr.free-electrons.com/ident?i=dma_get_max_seg_size) is about
> > in this case?
>
> Ah, ahci isn't setting the hardware limit properly but yeah that's the
> maximum segment size.
>
> > And my point was, it will be a multiple of 168 anyway, if 1344 is just
> > an example.
> >
> > > As written above, that probably makes the ahci command table size
> > > nicely aligned.
> >
> > I think that's what bothers me ultimately, cause I don't see how 168
> > makes it (more) nicely aligned (or even, aligned to what?).
>
> Hmmm... Looked at the sizes and they don't seem to align to anything
> meaningful. No idea.

The 168 makes AHCI_CMD_TBL_SZ equal to 2816

AHCI_CMD_TBL_SZ = AHCI_CMD_TBL_HDR_SZ + (AHCI_MAX_SG * 16)
AHCI_CMD_TBL_SZ = 128 + (168 * 16)

I think if you add in AHCI_CMD_SLOT_SZ (1024) and AHCI_RX_FIS_SZ (256)
the DMA is 4K aligned, I think that is where the 168 came from.

Thanks,
David
Re: Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
Hello, Tom.

On Sun, Aug 07, 2016 at 10:10:17PM +0800, Tom Yan wrote:
> So the (not so) recent bump of BLK_DEF_MAX_SECTORS from 1024 to 2560
> (commit d2be537c3ba3) seemed to have caused trouble to some of the ATA
> devices, which were then worked around with ATA_HORKAGE_MAX_SEC_1024.
>
> However, I am suspecting that the bump of BLK_DEF_MAX_SECTORS is not
> the "real" cause of the trouble, but the fact that AHCI_MAX_SG has
> been set to a weird value of 168 (with a comment "hardware max is
> 64K", which neither seem to make any sense).

Hmmm.. why not? The hardware limit is 64k and the driver is using a
lower limit of 168 most likely because it doesn't make noticeable
difference beyond certain point and it determines the size of
contiguous memory which has to be allocated for the command table.
Each sg entry is 16 bytes. Pushing it to the hardware limit would
require an order 9 allocation for each port.

> AHCI_MAX_SG is used to set the sg_tablesize (i.e. max_segments,
> apparently), which is apparently used to derive the actual "request
> size" (that is, if it is lower than max_sectors(_kb), it will be the
> limiting factor instead).
>
> For example, no matter if the drive has max_sectors set to 2560, or to
> 65535 (by adding it as the Optimal Transfer Length to libata's SATL,
> which is also max_hw_sectors that is set from ATA_MAX_SECTORS_LBA48),
> "avgrq-sz" in `iostat` will be capped at 1344 (168 * 8).

Not necessarily. A single sg entry can point to an area larger than
PAGE_SIZE.

> However, if I change AHCI_MAX_SG to 128 (which is also the
> sg_tablesize set in libata.h from LIBATA_MAX_PRD), "avgrq-sz" in
> `iostat` will be capped at 1024 (128 * 8), which should make
> ATA_HORKAGE_MAX_SEC_1024 unnecessary.
>
> So why has AHCI_MAX_SG been set to 168 anyway?

As written above, that probably makes the ahci command table size
nicely aligned.

Thanks.

-- 
tejun
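The "order 9" figure can be reproduced with a few lines of C. This is a sketch under stated assumptions: 4 KiB pages, a 128-byte command table header followed by 16 bytes per sg entry (the sizes quoted elsewhere in this thread), and a single command table per allocation. `alloc_order` and `cmd_tbl_size` are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_ASSUMED ((size_t)4096)

/* Hypothetical helper: size of one command table with `max_sg` entries. */
static inline size_t cmd_tbl_size(size_t max_sg)
{
	return 128 + max_sg * 16;
}

/*
 * Hypothetical helper: smallest order such that
 * (PAGE_SIZE_ASSUMED << order) is at least `size` bytes.
 */
static inline unsigned alloc_order(size_t size)
{
	unsigned order = 0;
	size_t span = PAGE_SIZE_ASSUMED;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

With 65535 entries one table is 1048688 bytes, just over 1 MiB, so `alloc_order()` returns 9 (a 2 MiB contiguous block). With 168 entries one table is 2816 bytes and fits in a single page (order 0).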
Regarding AHCI_MAX_SG and (ATA_HORKAGE_MAX_SEC_1024)
So the (not so) recent bump of BLK_DEF_MAX_SECTORS from 1024 to 2560
(commit d2be537c3ba3) seemed to have caused trouble to some of the ATA
devices, which were then worked around with ATA_HORKAGE_MAX_SEC_1024.

However, I am suspecting that the bump of BLK_DEF_MAX_SECTORS is not
the "real" cause of the trouble, but the fact that AHCI_MAX_SG has
been set to a weird value of 168 (with a comment "hardware max is
64K", which neither seem to make any sense).

AHCI_MAX_SG is used to set the sg_tablesize (i.e. max_segments,
apparently), which is apparently used to derive the actual "request
size" (that is, if it is lower than max_sectors(_kb), it will be the
limiting factor instead).

For example, no matter if the drive has max_sectors set to 2560, or to
65535 (by adding it as the Optimal Transfer Length to libata's SATL,
which is also max_hw_sectors that is set from ATA_MAX_SECTORS_LBA48),
"avgrq-sz" in `iostat` will be capped at 1344 (168 * 8).

However, if I change AHCI_MAX_SG to 128 (which is also the
sg_tablesize set in libata.h from LIBATA_MAX_PRD), "avgrq-sz" in
`iostat` will be capped at 1024 (128 * 8), which should make
ATA_HORKAGE_MAX_SEC_1024 unnecessary.

So why has AHCI_MAX_SG been set to 168 anyway?
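The caps quoted in this message follow from a simple model, sketched below in C. It assumes every segment covers exactly one 4 KiB page (8 sectors of 512 bytes), which is the premise behind the "168 * 8" figure; segments can be larger than a page, so this is a model of the reasoning, not of the block layer itself. `rq_cap_sectors` is a hypothetical helper.

```c
/* Assumptions: 4 KiB pages and 512-byte sectors, one page per segment. */
#define PAGE_SIZE_ASSUMED   4096u
#define SECTOR_SIZE_ASSUMED 512u

/*
 * Hypothetical helper: largest request size in sectors when limited by
 * both max_segments (one page each) and max_sectors.
 */
static inline unsigned rq_cap_sectors(unsigned max_segments, unsigned max_sectors)
{
	unsigned by_segments = max_segments * (PAGE_SIZE_ASSUMED / SECTOR_SIZE_ASSUMED);

	return by_segments < max_sectors ? by_segments : max_sectors;
}
```

Under this model, 168 segments cap a request at 1344 sectors whether max_sectors is 2560 or 65535, and 128 segments cap it at 1024.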