Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-07 Thread Bill Pringlemeir
On  3 Apr 2015, scottw...@freescale.com wrote:

 On Fri, 2015-04-03 at 22:28 +0200, Stefan Agner wrote:

 Why not? IMHO, there are valid reason to do it, since we save coping
 data over the bus (we save copying page size - write len of bytes)...

 According to http://www.linux-mtd.infradead.org/doc/ubi.html#L_subpage
 the motivation for subpages is saving space, not time, and it's only
 used for headers (specifically because using subpages may be slower).
 So it may not make a huge performance difference either way, even if
 subpages are less efficient on this controller -- though it does have
 a complexity cost that is higher than with most controllers.  It looks
 like the space savings is around one page per block.

I think that Artem wrote this.  I don't believe the document is
completely correct; I think he meant that sub-pages is not architected
as a speed improvement.  For instance, UBI will read both a VID (volume
ID) and EB (erase block).  As the are combined into one, then if a
driver needs to read a full page (like it is the only thing it supports)
then it is simple to read a full page with a VID/EB sub-pages.
Certainly for mounts with out 'fastmap', this will result in much faster
scans and initial UBI/UbiFS mounts?

It also saves one page overhead per erase block.  So in all cases,
sub-pages will optimize flash usage.  Probably performance depends on
the MTD driver and CPU.

 It'd be good to benchmark up front to be sure you're happy with the
 speed/space/complexity tradeoff, though, since enabling/disabling
 subpage writes breaks UBI compatibility.

I think that if you format the UbiFs without sub-pages and then update
code to handle sub-pages, you are fine.  If you have flash/UBI formatted
with sub-pages and then you disable it in the code, the disk is no
longer mountable.

In any case the document has,

  If the NAND flash supports sub-pages, then what can be done is ECC
  codes can be calculated on per-sub-page basis, instead of per-NAND
  page basis. In this case it becomes possible to read and write
  sub-pages independently.

Probably if we want sub-pages with this controller and hw-ecc, we need
to use the virtual buffer features of the chip.  The controller needs an
entire current buffer in the SRAM to calculate the hw-ecc to write.  So
even if it writes less than a full page, the entire page must be read to
calculate the hw-ecc to be written.  I am pretty sure that all
controllers that support hw-ecc will need to do this.

Fwiw,
Bill Pringlemeir.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-07 Thread Scott Wood
On Tue, 2015-04-07 at 10:06 -0400, Bill Pringlemeir wrote:
 On  3 Apr 2015, scottw...@freescale.com wrote:
 
  On Fri, 2015-04-03 at 22:28 +0200, Stefan Agner wrote:
 
  Why not? IMHO, there are valid reason to do it, since we save coping
  data over the bus (we save copying page size - write len of bytes)...
 
  According to http://www.linux-mtd.infradead.org/doc/ubi.html#L_subpage
  the motivation for subpages is saving space, not time, and it's only
  used for headers (specifically because using subpages may be slower).
  So it may not make a huge performance difference either way, even if
  subpages are less efficient on this controller -- though it does have
  a complexity cost that is higher than with most controllers.  It looks
  like the space savings is around one page per block.
 
 I think that Artem wrote this.  I don't believe the document is
 completely correct; I think he meant that sub-pages is not architected
 as a speed improvement.  For instance, UBI will read both a VID (volume
 ID) and EB (erase block).  As the are combined into one, then if a
 driver needs to read a full page (like it is the only thing it supports)
 then it is simple to read a full page with a VID/EB sub-pages.
 Certainly for mounts with out 'fastmap', this will result in much faster
 scans and initial UBI/UbiFS mounts?
 
 It also saves one page overhead per erase block.  So in all cases,
 sub-pages will optimize flash usage.  Probably performance depends on
 the MTD driver and CPU.

Possibly.  Again, I recommend benchmarking before committing to it.

  It'd be good to benchmark up front to be sure you're happy with the
  speed/space/complexity tradeoff, though, since enabling/disabling
  subpage writes breaks UBI compatibility.
 
 I think that if you format the UbiFs without sub-pages and then update
 code to handle sub-pages, you are fine.  If you have flash/UBI formatted
 with sub-pages and then you disable it in the code, the disk is no
 longer mountable.

If that's the case, then that's even more reason to leave subpages
disabled until you're sure you want them.

 In any case the document has,
 
   If the NAND flash supports sub-pages, then what can be done is ECC
   codes can be calculated on per-sub-page basis, instead of per-NAND
   page basis. In this case it becomes possible to read and write
   sub-pages independently.
 
 Probably if we want sub-pages with this controller and hw-ecc, we need
 to use the virtual buffer features of the chip.  The controller needs an
 entire current buffer in the SRAM to calculate the hw-ecc to write.  So
 even if it writes less than a full page, the entire page must be read to
 calculate the hw-ecc to be written.

That limitation sounds similar to the Freescale NAND controllers that
I'm familiar with (eLBC and IFC).  For eLBC we do subpages by just
writing 0xff, because on that controller the ECC of all 0xff is all 0xff
so it doesn't disturb anything.  IFC has different ECC algorithms where
that isn't the case, so we disabled subpage write on IFC.

What is the virtual buffer feature?

 I am pretty sure that all controllers that support hw-ecc will need to do 
 this.

Not if they can handle doing ECC on a single subpage.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-07 Thread Bill Pringlemeir
On  7 Apr 2015, scottw...@freescale.com wrote:
 On Tue, 2015-04-07 at 10:06 -0400, Bill Pringlemeir wrote:

 In any case the document has,

 If the NAND flash supports sub-pages, then what can be done is ECC
 codes can be calculated on per-sub-page basis, instead of per-NAND
 page basis. In this case it becomes possible to read and write
 sub-pages independently.

 Probably if we want sub-pages with this controller and hw-ecc, we
 need to use the virtual buffer features of the chip.  The controller
 needs an entire current buffer in the SRAM to calculate the hw-ecc to
 write.  So even if it writes less than a full page, the entire page
 must be read to calculate the hw-ecc to be written.

 That limitation sounds similar to the Freescale NAND controllers that
 I'm familiar with (eLBC and IFC).  For eLBC we do subpages by just
 writing 0xff, because on that controller the ECC of all 0xff is all
 0xff so it doesn't disturb anything.  IFC has different ECC algorithms
 where that isn't the case, so we disabled subpage write on IFC.

 What is the virtual buffer feature?

 I am pretty sure that all controllers that support hw-ecc will need
 to do this.

 Not if they can handle doing ECC on a single subpage.

[from another thread, but the same subject].

 The other way to handle things would to be to investigate the
 NFC_CFG[PAGE_CNT] and NFC_SECSZ to have the virtual pages support
 sub-pages.  I think the OOB mapping would be non-standard in such
 cases.

 Wouldn't that mess up factory bad block markers?

All the stuff above is related (afaik).

   What is the virtual buffer feature?

It splits programming of a flash page into separate buffers.  The
BCH-HW-ECC is then applied separately to each 'virtual-page' with
reduced strength.  Section 31.4.6 Organization of the Data in the NAND
Flash of the Vybrid Software RM has details.

  Wouldn't that mess up factory bad block markers?

I am unsure; certainly they can be read but they might be a data portion
of the fourth sub-page depending on the ECC strength selected.  There is
also a 'NFC_SWAP' register to switch the position of one 16bit index
(move OOB-BBT 16bit from data to OOB).  I think this can be used.  By
non-standard, I meant not fitting the current drivers idea of OOB
layout.

However, it seems like your comment that the ECC must be different
per-subpage (and what Artem said in the sub-pages documentation) makes
'virtual buffers' the most promising path for this driver and sub-page
support with hw-ecc.  As the bit strength is reduced, the amount of
corrections/error detection also seems reduced.  I think the math would
make this the same for everyone?

Your question about factory BBT is a good point.  Using the 'virtual
buffers' feature will complicate the driver code by quite a bit at least
from what I could think of and what I see in the MTD tree where I
believe similar features are used.

Fwiw,
Bill Pringlemeir.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-07 Thread Scott Wood
On Tue, 2015-04-07 at 13:54 -0400, Bill Pringlemeir wrote:
 On  7 Apr 2015, scottw...@freescale.com wrote:
  On Tue, 2015-04-07 at 10:06 -0400, Bill Pringlemeir wrote:
 
  In any case the document has,
 
  If the NAND flash supports sub-pages, then what can be done is ECC
  codes can be calculated on per-sub-page basis, instead of per-NAND
  page basis. In this case it becomes possible to read and write
  sub-pages independently.
 
  Probably if we want sub-pages with this controller and hw-ecc, we
  need to use the virtual buffer features of the chip.  The controller
  needs an entire current buffer in the SRAM to calculate the hw-ecc to
  write.  So even if it writes less than a full page, the entire page
  must be read to calculate the hw-ecc to be written.
 
  That limitation sounds similar to the Freescale NAND controllers that
  I'm familiar with (eLBC and IFC).  For eLBC we do subpages by just
  writing 0xff, because on that controller the ECC of all 0xff is all
  0xff so it doesn't disturb anything.  IFC has different ECC algorithms
  where that isn't the case, so we disabled subpage write on IFC.
 
  What is the virtual buffer feature?
 
  I am pretty sure that all controllers that support hw-ecc will need
  to do this.
 
  Not if they can handle doing ECC on a single subpage.
 
 [from another thread, but the same subject].
 
  The other way to handle things would to be to investigate the
  NFC_CFG[PAGE_CNT] and NFC_SECSZ to have the virtual pages support
  sub-pages.  I think the OOB mapping would be non-standard in such
  cases.
 
  Wouldn't that mess up factory bad block markers?
 
 All the stuff above is related (afaik).
 
What is the virtual buffer feature?
 
 It splits programming of a flash page into separate buffers.  The
 BCH-HW-ECC is then applied separately to each 'virtual-page' with
 reduced strength.  Section 31.4.6 Organization of the Data in the NAND
 Flash of the Vybrid Software RM has details.
 
   Wouldn't that mess up factory bad block markers?
 
 I am unsure; certainly they can be read but they might be a data portion
 of the fourth sub-page depending on the ECC strength selected.

That would be bad.  Even if you somehow get the MTD code to read markers
in the main area on first use, you wouldn't be able to reconstruct the
BBT after you start writing data.  If you must do this, I suggest
migrating the bad block markers to the new OOB before usage (see
http://patchwork.ozlabs.org/patch/168855/ for an example).

 However, it seems like your comment that the ECC must be different
 per-subpage (and what Artem said in the sub-pages documentation) makes
 'virtual buffers' the most promising path for this driver and sub-page
 support with hw-ecc.  As the bit strength is reduced, the amount of
 corrections/error detection also seems reduced.  I think the math would
 make this the same for everyone?

Flash chips usually specify the required ECC on a subpage basis.

 Your question about factory BBT is a good point.  Using the 'virtual
 buffers' feature will complicate the driver code by quite a bit at least
 from what I could think of and what I see in the MTD tree where I
 believe similar features are used.

Which gets back to the question of whether subpages are worth it with
this controller.  Or, if subpage writes are infrequent enough, stick
with the read/modify/write approach but delay the read until you know
that it's going to be a subpage write.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-03 Thread Stefan Agner
On 2015-04-03 01:48, Scott Wood wrote:
 On Tue, 2015-03-31 at 11:02 -0400, Bill Pringlemeir wrote:
 On 2015-03-31 00:15, Scott Wood wrote:

  Especially since you'd be doing one write rather than four full-page
  partial writes.  Surely the bottleneck here is the NAND chip itself,
  not copying data to the buffer?

 The AHB bus that the NFC controller is on is relatively slow.  Here are
 some numbers from 'AN4947-vybrid-bus-architechure',

 Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),

First read Subsequent
2858  all caches on
345269no cache, mmu
437371no cache, no mmu

 The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
 DDR).  The AHB will be about four times slower.  Also the reads and
 writes to the physical NAND must take place serially.  Here are the
 program page steps.

   1. Issue controller Read full page to NFC buffer.
   2. Copy update partial page from DDR to NFC buffer.
   3. Issue write NAND page.
 
 Why is any sort of read part of the write process?

To recalculate the correct ECC, which is done in the controller, the
controller has to have the page in the SRAM. I will send out a patch
which implements vf610_nfc_write_subpage. And the read part is done when
the MTD subsystem calls NAND_CMD_SEQIN.

Actually, the Linux NAND driver supports subpage writes already by using
the generic nand_write_subpage_hwecc function. However, in U-Boot I
added driver specific page_read/page_write due to performance reasons:
http://lists.denx.de/pipermail/u-boot/2014-August/186293.html

However, I tried driver specific page_read/page_write functions for
Linux too, but I couldn't measure noticeable performance improvements. I
think the reason why it lead to noticeable improvements for U-Boot was
because U-Boot does not use caches so far. We could also switch to the
framework functions again, but those are more complex than necessary.
Given that U-Boot uses device specific binaries anyway, there is no size
advantage by using the (shared) MTD subsystems functions.

--
Stefan
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-03 Thread Scott Wood
On Fri, 2015-04-03 at 22:28 +0200, Stefan Agner wrote:
 On 2015-04-03 22:15, Scott Wood wrote:
  On Fri, 2015-04-03 at 20:09 +0200, Stefan Agner wrote:
  On 2015-04-03 01:48, Scott Wood wrote:
   On Tue, 2015-03-31 at 11:02 -0400, Bill Pringlemeir wrote:
   On 2015-03-31 00:15, Scott Wood wrote:
  
Especially since you'd be doing one write rather than four full-page
partial writes.  Surely the bottleneck here is the NAND chip itself,
not copying data to the buffer?
  
   The AHB bus that the NFC controller is on is relatively slow.  Here are
   some numbers from 'AN4947-vybrid-bus-architechure',
  
   Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),
  
  First read Subsequent
  2858  all caches on
  345269no cache, mmu
  437371no cache, no mmu
  
   The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
   DDR).  The AHB will be about four times slower.  Also the reads and
   writes to the physical NAND must take place serially.  Here are the
   program page steps.
  
 1. Issue controller Read full page to NFC buffer.
 2. Copy update partial page from DDR to NFC buffer.
 3. Issue write NAND page.
  
   Why is any sort of read part of the write process?
 
  To recalculate the correct ECC, which is done in the controller, the
  controller has to have the page in the SRAM. I will send out a patch
  which implements vf610_nfc_write_subpage. And the read part is done when
  the MTD subsystem calls NAND_CMD_SEQIN.
  
  Again, if this is the only way you can do subpage accesses then you
  should not do them.
 
 Why not? IMHO, there are valid reason to do it, since we save coping
 data over the bus (we save copying page size - write len of bytes)...

According to http://www.linux-mtd.infradead.org/doc/ubi.html#L_subpage
the motivation for subpages is saving space, not time, and it's only
used for headers (specifically because using subpages may be slower). 
So it may not make a huge performance difference either way, even if
subpages are less efficient on this controller -- though it does have a
complexity cost that is higher than with most controllers.  It looks
like the space savings is around one page per block.

It'd be good to benchmark up front to be sure you're happy with the
speed/space/complexity tradeoff, though, since enabling/disabling
subpage writes breaks UBI compatibility.

 Also, I guess all NAND controller which do HW ECC need to read at least
 ECC step size back to the controller... Maybe we can move the discussion
 to the actual code (see mtd: vf610_nfc: support subpage write).

No, most controller drivers do not read from the NAND chip during the
programming process.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-03 Thread Scott Wood
On Fri, 2015-04-03 at 20:09 +0200, Stefan Agner wrote:
 On 2015-04-03 01:48, Scott Wood wrote:
  On Tue, 2015-03-31 at 11:02 -0400, Bill Pringlemeir wrote:
  On 2015-03-31 00:15, Scott Wood wrote:
 
   Especially since you'd be doing one write rather than four full-page
   partial writes.  Surely the bottleneck here is the NAND chip itself,
   not copying data to the buffer?
 
  The AHB bus that the NFC controller is on is relatively slow.  Here are
  some numbers from 'AN4947-vybrid-bus-architechure',
 
  Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),
 
 First read Subsequent
 2858  all caches on
 345269no cache, mmu
 437371no cache, no mmu
 
  The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
  DDR).  The AHB will be about four times slower.  Also the reads and
  writes to the physical NAND must take place serially.  Here are the
  program page steps.
 
1. Issue controller Read full page to NFC buffer.
2. Copy update partial page from DDR to NFC buffer.
3. Issue write NAND page.
  
  Why is any sort of read part of the write process?
 
 To recalculate the correct ECC, which is done in the controller, the
 controller has to have the page in the SRAM. I will send out a patch
 which implements vf610_nfc_write_subpage. And the read part is done when
 the MTD subsystem calls NAND_CMD_SEQIN.

Again, if this is the only way you can do subpage accesses then you
should not do them.

 Actually, the Linux NAND driver supports subpage writes already by using
 the generic nand_write_subpage_hwecc function. However, in U-Boot I
 added driver specific page_read/page_write due to performance reasons:
 http://lists.denx.de/pipermail/u-boot/2014-August/186293.html
 
 However, I tried driver specific page_read/page_write functions for
 Linux too, but I couldn't measure noticeable performance improvements. I
 think the reason why it lead to noticeable improvements for U-Boot was
 because U-Boot does not use caches so far.

If you care about U-Boot's performance at all, enabling cache would be
the first thing I'd try...

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-03 Thread Stefan Agner
On 2015-04-03 22:15, Scott Wood wrote:
 On Fri, 2015-04-03 at 20:09 +0200, Stefan Agner wrote:
 On 2015-04-03 01:48, Scott Wood wrote:
  On Tue, 2015-03-31 at 11:02 -0400, Bill Pringlemeir wrote:
  On 2015-03-31 00:15, Scott Wood wrote:
 
   Especially since you'd be doing one write rather than four full-page
   partial writes.  Surely the bottleneck here is the NAND chip itself,
   not copying data to the buffer?
 
  The AHB bus that the NFC controller is on is relatively slow.  Here are
  some numbers from 'AN4947-vybrid-bus-architechure',
 
  Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),
 
 First read Subsequent
 2858  all caches on
 345269no cache, mmu
 437371no cache, no mmu
 
  The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
  DDR).  The AHB will be about four times slower.  Also the reads and
  writes to the physical NAND must take place serially.  Here are the
  program page steps.
 
1. Issue controller Read full page to NFC buffer.
2. Copy update partial page from DDR to NFC buffer.
3. Issue write NAND page.
 
  Why is any sort of read part of the write process?

 To recalculate the correct ECC, which is done in the controller, the
 controller has to have the page in the SRAM. I will send out a patch
 which implements vf610_nfc_write_subpage. And the read part is done when
 the MTD subsystem calls NAND_CMD_SEQIN.
 
 Again, if this is the only way you can do subpage accesses then you
 should not do them.

Why not? IMHO, there are valid reason to do it, since we save coping
data over the bus (we save copying page size - write len of bytes)...
Also, I guess all NAND controller which do HW ECC need to read at least
ECC step size back to the controller... Maybe we can move the discussion
to the actual code (see mtd: vf610_nfc: support subpage write).
 
 Actually, the Linux NAND driver supports subpage writes already by using
 the generic nand_write_subpage_hwecc function. However, in U-Boot I
 added driver specific page_read/page_write due to performance reasons:
 http://lists.denx.de/pipermail/u-boot/2014-August/186293.html

 However, I tried driver specific page_read/page_write functions for
 Linux too, but I couldn't measure noticeable performance improvements. I
 think the reason why it lead to noticeable improvements for U-Boot was
 because U-Boot does not use caches so far.
 
 If you care about U-Boot's performance at all, enabling cache would be
 the first thing I'd try...

Yes, we have that in our downstream since some months, and patch is
pending, see:
http://article.gmane.org/gmane.comp.boot-loaders.u-boot/215896

But it was not the first thing we did, we started with the functionality
needed to boot, such as reading from NAND... Hence we could now go back
and use the MTD subsystems generic functions. I do not have a strong
opinion on this...

--
Stefan
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-04-02 Thread Scott Wood
On Tue, 2015-03-31 at 11:02 -0400, Bill Pringlemeir wrote:
 On 2015-03-31 00:15, Scott Wood wrote:
 
  Especially since you'd be doing one write rather than four full-page
  partial writes.  Surely the bottleneck here is the NAND chip itself,
  not copying data to the buffer?
 
 The AHB bus that the NFC controller is on is relatively slow.  Here are
 some numbers from 'AN4947-vybrid-bus-architechure',
 
 Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),
 
First read Subsequent
2858  all caches on
345269no cache, mmu
437371no cache, no mmu
 
 The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
 DDR).  The AHB will be about four times slower.  Also the reads and
 writes to the physical NAND must take place serially.  Here are the
 program page steps.
 
   1. Issue controller Read full page to NFC buffer.
   2. Copy update partial page from DDR to NFC buffer.
   3. Issue write NAND page.

Why is any sort of read part of the write process?

If this controller can't handle subpage writes properly, then disable
them.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-31 Thread Bill Pringlemeir
On 31 Mar 2015, scottw...@freescale.com wrote:

 On Tue, 2015-03-31 at 00:24 +0200, Stefan Agner wrote:
 Actually, I just realized that the driver is not caching writes.

 switch (command) {
 case NAND_CMD_PAGEPROG:
 nfc-page = -1;
 +vf610_nfc_transfer_size(nfc-regs, nfc-page_sz);
 vf610_nfc_send_commands(nfc-regs, NAND_CMD_SEQIN,
 command, PROGRAM_PAGE_CMD_CODE);
 vf610_nfc_addr_cycle(mtd, column, page);
 break;

 The nfc-page = -1 resets the page cache, so the next read triggers a
 full page read.

Yes, this is correct.  I should have looked at the code again.  It would
be interesting to see what it would do if we did cache.  I know I was
concerned about the upper layers and this caching; Ie, they may
explicitly want to check a physical re-read of a page just after
programming.

 I will check the performance consequences when disabling cache
 entirely, and whether it would be possible to implement a OOB only
 read.

 OK.  In the meantime I'll apply this.

The patches are surely an improvement; maybe not perfect.  I think this
is a good decision.

On 2015-03-31 00:15, Scott Wood wrote:

 Especially since you'd be doing one write rather than four full-page
 partial writes.  Surely the bottleneck here is the NAND chip itself,
 not copying data to the buffer?

The AHB bus that the NFC controller is on is relatively slow.  Here are
some numbers from 'AN4947-vybrid-bus-architechure',

Vybrid Cortex A5 to DDR (in CPU clocks 400/500MHz),

   First read Subsequent
   2858  all caches on
   345269no cache, mmu
   437371no cache, no mmu

The NFC is on an AHB bus 32bit, 66MHz (not AXI 64bit, 133-166MHz like
DDR).  The AHB will be about four times slower.  Also the reads and
writes to the physical NAND must take place serially.  Here are the
program page steps.

  1. Issue controller Read full page to NFC buffer.
  2. Copy update partial page from DDR to NFC buffer.
  3. Issue write NAND page.

The alternate,

  1. Issue Read full page to NFC buffer.
  2. Copy full page to DDR.
  3. Update DDR with new data.
  4. Copy updated DDR page to NFC buffer.
  5. Issue write NAND page.

This involves a lot more bus transactions with the NFC AHB as either the
source and/or destination.  To disable the 'cache' (code in vf610_nfc),
we would actually need page program to only be called with full page
data (maybe that is the case?).  This is then much better,

  1. Copy full page to NFC buffer.
  2. Write NAND page.

Probably it would be beneficial to test this in the 'NAND_CMD_SEQIN' and
not issue the 'read'.  Unfortunately, I don't think there is a way to
know if a full page is being written from the driver with the current
NAND API?  Maybe the upper layer has already read the whole page, so the
NAND_CMD_SEQIN is always called with the page as current.  Maybe,

case NAND_CMD_SEQIN: /* Pre-read for partial writes. */
+   /* Already read? */
+   if (nfc-page == page)   /* If this is always true, the */
+   return;  /* NFC caching does nothing. */
case NAND_CMD_READ0:
column = 0;
-   /* Already read? */
-   if (nfc-page == page)
-   return;
nfc-page = page;

The AHB bus speed is a similar order to the NFC (33 MHz NFC clock versus
66MHz AHB clock) with both single beat after setup; you can actually
clock the NAND faster with out hw ECC with some NAND chips supporting
~50MHz.  Initially the NFC clock was under 10MHz; at this frequency NAND
chip timing is quite dominate.  After improving the NFC clock, the
actual transfer from/to the main DDR to the NFC buffers actually becomes
significant.

Well, that is my hypothesis.  I guess some measurements are always
beneficial.  I had measured with/without 'cache' and the results were not
insignificant.  I also put 'printk()' to see what pages were being
read/written and it seemed that both 'read followed by write' and 'write
followed by read' were issued quite frequently to the same page.  I am
also pretty sure that most systems will be structured like this.

 CPU - L1 - L2? - High speed bus - DDR
 CPU  - Low speed bus  - NFC

So, I don't think that this is Vybrid specific.  The PPC, ColdFire, etc
will probably have similar issues.  DMA has the same limitations as the
CPU, with setup overhead.  Of course, you can parallel the main CPU with
DMA but many systems want the NAND to complete synchronously; especially
u-boot.

Fwiw,
Bill Pringlemeir.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Bill Pringlemeir
On 24 Mar 2015, ste...@agner.ch wrote:

 The driver tries to re-use the page buffer by storing the page
 number of the current page in the buffer. The page is only read
 if the requested page number is not currently in the buffer. When
 a block is erased, the page number is marked as invalid if the
 erased page equals the one currently in the cache. However, since
 a erase block consists of multiple pages, also other page numbers
 could be affected.

 The commands to reproduce this issue (on a written page):
 nand dump 0x800
 nand erase 0x0 0x2
 nand dump 0x800

 The second nand dump command returns the data from the buffer,
 while in fact the page is erased (0xff).

 Avoid the hassle to calculate whether the page is affected or not,
 but set the page buffer unconditionally to invalid instead.

 Signed-off-by: Stefan Agner ste...@agner.ch
 ---
 This are two bug fixes which would be nice if they would still
 make it into 2015.04...

 drivers/mtd/nand/vf610_nfc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

 diff --git a/drivers/mtd/nand/vf610_nfc.c
 b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
 a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
 -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
 unsigned command, break;

   case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
 -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
 NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
 page);

This change looks sensible.  It is also possible that because sub-pages
were removed that we could just remove the caching all together.  It is
possible that a higher layer may intentionally want to program and then
do a read to verify.

I had seen that different FS seem to do 'write' and then immediately
follow with a read.  If you believe the controller and the write status
was ok, then I think it is fine to reuse the existing buffer and keep
this caching.

For UBI, there were VID/EB header reads when sub-pages were enabled as
they are in the same page; but these are now seperate pages.
Especially, people may only care about code size in 'u-boot' and the
typical use will only be to read an image (or config) from NAND so this
optimization is probably not too helpful (execept maybe when flashing
new kernels).  The 2nd patch set is not needed if we do not re-use
existing data?

I guess we want to stay the same as the mainline Linux you are
submitting.  I haven't benchmarked the updates since sub-pages were
removed to see if this is worth it.  I think it was only ~10-20% in some
benchmark I was doing with the 'caching'.

At least in the small, this is a minimal change that is correct.

Ack-by: Bill Pringlemeir bpringlem...@nbsps.com

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Scott Wood
On Mon, 2015-03-30 at 13:02 -0400, Bill Pringlemeir wrote:
 On 24 Mar 2015, ste...@agner.ch wrote:
 
  The driver tries to re-use the page buffer by storing the page
  number of the current page in the buffer. The page is only read
  if the requested page number is not currently in the buffer. When
  a block is erased, the page number is marked as invalid if the
  erased page equals the one currently in the cache. However, since
  a erase block consists of multiple pages, also other page numbers
  could be affected.
 
  The commands to reproduce this issue (on a written page):
  nand dump 0x800
  nand erase 0x0 0x2
  nand dump 0x800
 
  The second nand dump command returns the data from the buffer,
  while in fact the page is erased (0xff).
 
  Avoid the hassle to calculate whether the page is affected or not,
  but set the page buffer unconditionally to invalid instead.
 
  Signed-off-by: Stefan Agner ste...@agner.ch
  ---
  This are two bug fixes which would be nice if they would still
  make it into 2015.04...
 
  drivers/mtd/nand/vf610_nfc.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)
 
  diff --git a/drivers/mtd/nand/vf610_nfc.c
  b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
  a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
  -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
  unsigned command, break;
 
  case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
  -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
  NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
  page);
 
 This change looks sensible.  It is also possible that because sub-pages
 were removed that we could just remove the caching all together.  It is
 possible that a higher layer may intentionally want to program and then
 do a read to verify.

It's more than possible -- Peter Tyser posted patches to do exactly that
for command-line NAND writes.

 I had seen that different FS seem to do 'write' and then immediately
 follow with a read.  If you believe the controller and the write status
 was ok, then I think it is fine to reuse the existing buffer and keep
 this caching.

If the upper layers want to cache then let them cache.

 I guess we want to stay the same as the mainline Linux you are
 submitting.

So fix Linux. :-)

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Stefan Agner
On 2015-03-30 19:02, Bill Pringlemeir wrote:
 On 24 Mar 2015, ste...@agner.ch wrote:
 
 The driver tries to re-use the page buffer by storing the page
 number of the current page in the buffer. The page is only read
 if the requested page number is not currently in the buffer. When
 a block is erased, the page number is marked as invalid if the
 erased page equals the one currently in the cache. However, since
 a erase block consists of multiple pages, also other page numbers
 could be affected.

 The commands to reproduce this issue (on a written page):
 nand dump 0x800
 nand erase 0x0 0x2
 nand dump 0x800

 The second nand dump command returns the data from the buffer,
 while in fact the page is erased (0xff).

 Avoid the hassle to calculate whether the page is affected or not,
 but set the page buffer unconditionally to invalid instead.

 Signed-off-by: Stefan Agner ste...@agner.ch
 ---
 This are two bug fixes which would be nice if they would still
 make it into 2015.04...

 drivers/mtd/nand/vf610_nfc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

 diff --git a/drivers/mtd/nand/vf610_nfc.c
 b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
 a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
 -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
 unsigned command, break;

  case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
 -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
 NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
 page);
 
 This change looks sensible.  It is also possible that because sub-pages
 were removed that we could just remove the caching all together.  It is
 possible that a higher layer may intentionally want to program and then
 do a read to verify.

Hm, I now realize that this actually happened somewhat by accident. Back
when I migrated the driver to U-Boot, I removed the hwctl function since
it was an empty function. This probably lead to the problem with sub
page writes, which is why sub-page writes has been removed (by adding
NAND_NO_SUBPAGE_WRITE).

However, in order to avoid a full page getting allocated and memcpy'ed
when only writing a part of a page, we actually could re-enable that
feature. But I'm not sure about the semantics of a subpage writes: Does
the stack makes sure that the rest of the bytes are in the cache (e.g.
read before write)? From what I understand, a subpage write would only
copy (via vf610_nfc_write_buf) the data required into the the page
cache, which then gets written to the device. Who makes sure that the
rest of the page contains sound data?

 
 I had seen that different FS seem to do 'write' and then immediately
 follow with a read.  If you believe the controller and the write status
 was ok, then I think it is fine to reuse the existing buffer and keep
 this caching.
 
 For UBI, there were VID/EB header reads when sub-pages were enabled as
 they are in the same page; but these are now seperate pages.
 Especially, people may only care about code size in 'u-boot' and the
 typical use will only be to read an image (or config) from NAND so this
 optimization is probably not too helpful (execept maybe when flashing
 new kernels).  The 2nd patch set is not needed if we do not re-use
 existing data?

You mean mtd: vf610_nfc: specify transfer size before each transfer?
When removing the page cache here, it probably wouldn't be needed...
However, I think that patch would still make sense since it leads to
less data beeing copied between the NAND device and the host. Especially
the status command gets called quite often. I'm not 100% sure, but I
think the performance improvement between v3 and v4 of the Linux kernel
driver was due to that fix.


 
 I guess we want to stay the same as the mainline Linux you are
 submitting.  I haven't benchmarked the updates since sub-pages were
 removed to see if this is worth it.  I think it was only ~10-20% in some
 benchmark I was doing with the 'caching'.

I think it mainly makes sense when the higher layer reads OOB and then
the main data or visa-verse. I saw such access patterns at least in
U-Boot.

 
 At least in the small, this is a minimal change that is correct.
 
 Ack-by: Bill Pringlemeir bpringlem...@nbsps.com

Thanks for the Ack. If still possible, it would be nice to see that in
2015.04... :-)

--
Stefan
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Stefan Agner
On 2015-03-30 22:34, Scott Wood wrote:
 On Mon, 2015-03-30 at 13:02 -0400, Bill Pringlemeir wrote:
 On 24 Mar 2015, ste...@agner.ch wrote:

  The driver tries to re-use the page buffer by storing the page
  number of the current page in the buffer. The page is only read
  if the requested page number is not currently in the buffer. When
  a block is erased, the page number is marked as invalid if the
  erased page equals the one currently in the cache. However, since
  a erase block consists of multiple pages, also other page numbers
  could be affected.
 
  The commands to reproduce this issue (on a written page):
  nand dump 0x800
  nand erase 0x0 0x2
  nand dump 0x800
 
  The second nand dump command returns the data from the buffer,
  while in fact the page is erased (0xff).
 
  Avoid the hassle to calculate whether the page is affected or not,
  but set the page buffer unconditionally to invalid instead.
 
  Signed-off-by: Stefan Agner ste...@agner.ch
  ---
  This are two bug fixes which would be nice if they would still
  make it into 2015.04...
 
  drivers/mtd/nand/vf610_nfc.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)
 
  diff --git a/drivers/mtd/nand/vf610_nfc.c
  b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
  a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
  -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
  unsigned command, break;
 
 case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
  -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
  NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
  page);

 This change looks sensible.  It is also possible that because sub-pages
 were removed that we could just remove the caching all together.  It is
 possible that a higher layer may intentionally want to program and then
 do a read to verify.
 
 It's more than possible -- Peter Tyser posted patches to do exactly that
 for command-line NAND writes.
 
 I had seen that different FS seem to do 'write' and then immediately
 follow with a read.  If you believe the controller and the write status
 was ok, then I think it is fine to reuse the existing buffer and keep
 this caching.
 
 If the upper layers want to cache then let them cache.
 

In this case, the lower layer is doing the caching (the driver).
Depending on what the idea was behind the reread (e.g. check the amount
of ECC erros just after write?), this caching inside the driver might
miss lead the upper layers...? However, if removing the caching in the
driver would lead to a performance drop, I would rather prefer to keep
it...

 I guess we want to stay the same as the mainline Linux you are
 submitting.
 
 So fix Linux. :-)

The driver is actually not yet in Linux. That change however is included
in the latest revision. Hence, yes, we will :-)

--
Stefan
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Scott Wood
On Mon, 2015-03-30 at 22:40 +0200, Stefan Agner wrote:
 On 2015-03-30 22:34, Scott Wood wrote:
  On Mon, 2015-03-30 at 13:02 -0400, Bill Pringlemeir wrote:
  On 24 Mar 2015, ste...@agner.ch wrote:
 
   The driver tries to re-use the page buffer by storing the page
   number of the current page in the buffer. The page is only read
   if the requested page number is not currently in the buffer. When
   a block is erased, the page number is marked as invalid if the
   erased page equals the one currently in the cache. However, since
   a erase block consists of multiple pages, also other page numbers
   could be affected.
  
   The commands to reproduce this issue (on a written page):
   nand dump 0x800
   nand erase 0x0 0x2
   nand dump 0x800
  
   The second nand dump command returns the data from the buffer,
   while in fact the page is erased (0xff).
  
   Avoid the hassle to calculate whether the page is affected or not,
   but set the page buffer unconditionally to invalid instead.
  
   Signed-off-by: Stefan Agner ste...@agner.ch
   ---
   This are two bug fixes which would be nice if they would still
   make it into 2015.04...
  
   drivers/mtd/nand/vf610_nfc.c | 3 +--
   1 file changed, 1 insertion(+), 2 deletions(-)
  
   diff --git a/drivers/mtd/nand/vf610_nfc.c
   b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
   a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
   -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
   unsigned command, break;
  
case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
   -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
   NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
   page);
 
  This change looks sensible.  It is also possible that because sub-pages
  were removed that we could just remove the caching all together.  It is
  possible that a higher layer may intentionally want to program and then
  do a read to verify.
  
  It's more than possible -- Peter Tyser posted patches to do exactly that
  for command-line NAND writes.
  
  I had seen that different FS seem to do 'write' and then immediately
  follow with a read.  If you believe the controller and the write status
  was ok, then I think it is fine to reuse the existing buffer and keep
  this caching.
  
  If the upper layers want to cache then let them cache.
  
 
 In this case, the lower layer is doing the caching (the driver).

It shouldn't.

 Depending on what the idea was behind the reread (e.g. check the amount
 of ECC erros just after write?), this caching inside the driver might
 miss lead the upper layers...?

Yes.  It's also extra complexity that has already led to problems, as
this patchset demonstrates.

  However, if removing the caching in the driver would lead to a
 performance drop, I would rather prefer to keep it...

What is special about this controller, that caching makes sense here but
not on other controllers?  If it makes sense everywhere, then the upper
layer is the place to do it.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Bill Pringlemeir
On 30 Mar 2015, scottw...@freescale.com wrote:

 On Mon, 2015-03-30 at 22:40 +0200, Stefan Agner wrote:

 However, if removing the caching in the driver would lead to a
 performance drop, I would rather prefer to keep it...

 What is special about this controller, that caching makes sense here
 but not on other controllers?  If it makes sense everywhere, then the
 upper layer is the place to do it.

While I observed some performance differences, but that is an aside.  I
think the mxc driver is similar in sub-page (or partial page) support.
Before doing a sub-page write, it reads a full page and then copies the
sub-page update to the buffer to be reprogrammed.  This works fine if
each and every sub-page has separate OOB data for software ECC.  The
hardware ECC of this vf610_nfc controller doesn't support this.

At least that is my understanding of this [mxc_nand] code,

case NAND_CMD_SEQIN:
if (column = mtd-writesize) {
/*
 * before sending SEQIN command for partial write,
 * we need read one page out. FSL NFC does not support
 * partial write. It always sends out 512+ecc+512+ecc
 * for large page nand flash. But for small page nand
 * flash, it does support SPARE ONLY operation.
 */
if (host-pagesize_2k) {
/* call ourself to read a page */
mxc_nand_command(mtd, NAND_CMD_READ0, 0,
page_addr);
}

in 'drivers/mtd/nand/mxc_nand.c'.  So, originally this was mainly for
sub-pages (an optimization but also functionality).

So the 'caching' is just to preserve read data before a partial write as
the vf610 requires a full page (like the mxc).  The same 'caching' was
keep for 'write followed by read' when the ECC is successful.

I realize that the upper layers may not like this so I mentioned it; I
guess that is proof that the patch is getting a good review and I am not
causing problems for Stefan ;)

I also agree with Stefan that setting the SECSZ in the 2nd patch will
result in less data transfer (and maybe less current/power and system
noise).  However, this will be inside the NFC controller.  Talking over
the AHB to the NFC controller is quite slow on the Vybrid.  Certainly it
seems a little more elegant to tell the controller to transfer nothing
(and that we [programmers] expect nothing as a sort of documentation)?

I think it would be worthwhile to benchmark without the cache.  Or maybe
Stefan already has some numbers?  Upper layers doing partial pages will
definitely benefit with the 'cache'; we would also need more DDR memory
as the NFC controller memory is being used as a scratch buffer.

Fwiw,
Bill Pringlemeir.
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Stefan Agner
On 2015-03-30 22:48, Scott Wood wrote:
 On Mon, 2015-03-30 at 22:40 +0200, Stefan Agner wrote:
 On 2015-03-30 22:34, Scott Wood wrote:
  On Mon, 2015-03-30 at 13:02 -0400, Bill Pringlemeir wrote:
  On 24 Mar 2015, ste...@agner.ch wrote:
 
   The driver tries to re-use the page buffer by storing the page
   number of the current page in the buffer. The page is only read
   if the requested page number is not currently in the buffer. When
   a block is erased, the page number is marked as invalid if the
   erased page equals the one currently in the cache. However, since
   a erase block consists of multiple pages, also other page numbers
   could be affected.
  
   The commands to reproduce this issue (on a written page):
   nand dump 0x800
   nand erase 0x0 0x2
   nand dump 0x800
  
   The second nand dump command returns the data from the buffer,
   while in fact the page is erased (0xff).
  
   Avoid the hassle to calculate whether the page is affected or not,
   but set the page buffer unconditionally to invalid instead.
  
   Signed-off-by: Stefan Agner ste...@agner.ch
   ---
   This are two bug fixes which would be nice if they would still
   make it into 2015.04...
  
   drivers/mtd/nand/vf610_nfc.c | 3 +--
   1 file changed, 1 insertion(+), 2 deletions(-)
  
   diff --git a/drivers/mtd/nand/vf610_nfc.c
   b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
   a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
   -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
   unsigned command, break;
  
   case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
   -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
   NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
   page);
 
  This change looks sensible.  It is also possible that because sub-pages
  were removed that we could just remove the caching all together.  It is
  possible that a higher layer may intentionally want to program and then
  do a read to verify.
 
  It's more than possible -- Peter Tyser posted patches to do exactly that
  for command-line NAND writes.
 
  I had seen that different FS seem to do 'write' and then immediately
  follow with a read.  If you believe the controller and the write status
  was ok, then I think it is fine to reuse the existing buffer and keep
  this caching.
 
  If the upper layers want to cache then let them cache.
 

 In this case, the lower layer is doing the caching (the driver).
 
 It shouldn't.
 
 Depending on what the idea was behind the reread (e.g. check the amount
 of ECC erros just after write?), this caching inside the driver might
 miss lead the upper layers...?
 
 Yes.  It's also extra complexity that has already led to problems, as
 this patchset demonstrates.
 
  However, if removing the caching in the driver would lead to a
 performance drop, I would rather prefer to keep it...
 
 What is special about this controller, that caching makes sense here but
 not on other controllers?  If it makes sense everywhere, then the upper
 layer is the place to do it.
 

Well, its not a caching which could be handled by upper layers: The NFC
reads the data into SRAM, where it gets hardware ECC checked. After
that, the hardware would be capable of copy the data into main memory
using DMA, however, the driver currently is not using this
functionality. Instead, we use memcpy, and copy only the data which are
requested.

As far as I remember, Bill once mentioned that memcpy using the CPU is
actually faster than DMA (while of course keeping the CPU busy, but that
is not really a problem for U-Boot since we anyway do not use interrupt
mode).

The driver currently stores the page number which was read the last
time. In case the upper layer request a read command for the same page,
we ignore that request.

I unify the discussion of the other thread, since its related:

 
 However, in order to avoid a full page getting allocated and memcpy'ed
 when only writing a part of a page, we actually could re-enable that
 feature. But I'm not sure about the semantics of a subpage writes: Does
 the stack makes sure that the rest of the bytes are in the cache (e.g.
 read before write)? From what I understand, a subpage write would only
 copy (via vf610_nfc_write_buf) the data required into the the page
 cache, which then gets written to the device. Who makes sure that the
 rest of the page contains sound data?
 
 If the rest of the page is all 0xff it shouldn't affect the existing
 data -- as long as the controller isn't writing some non-0xff ECC bytes
 as a result.
 
 It's moot if subpage writes are disabled, though.
 

Hm, and if its not 0xff? Is a sub page write valid in that case?

If a subpage write means that the rest of the page is 0xff, we could
also memset the whole SRAM 0xff and copy only the subpage data into the
buffer. We would still need to write the whole page, but only read parts
of it from main memory. OTH, when we 

Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Stefan Agner
On 2015-03-31 00:15, Scott Wood wrote:
 On Mon, 2015-03-30 at 23:26 +0200, Stefan Agner wrote:
 On 2015-03-30 22:48, Scott Wood wrote:
  What is special about this controller, that caching makes sense here but
  not on other controllers?  If it makes sense everywhere, then the upper
  layer is the place to do it.
 

 Well, its not a caching which could be handled by upper layers: The NFC
 reads the data into SRAM, where it gets hardware ECC checked. After
 that, the hardware would be capable of copy the data into main memory
 using DMA, however, the driver currently is not using this
 functionality. Instead, we use memcpy, and copy only the data which are
 requested.
 
 The upper layer could choose to read the whole page at once.
 
  However, in order to avoid a full page getting allocated and memcpy'ed
  when only writing a part of a page, we actually could re-enable that
  feature. But I'm not sure about the semantics of a subpage writes: Does
  the stack makes sure that the rest of the bytes are in the cache (e.g.
  read before write)? From what I understand, a subpage write would only
  copy (via vf610_nfc_write_buf) the data required into the the page
  cache, which then gets written to the device. Who makes sure that the
  rest of the page contains sound data?
 
  If the rest of the page is all 0xff it shouldn't affect the existing
  data -- as long as the controller isn't writing some non-0xff ECC bytes
  as a result.
 
  It's moot if subpage writes are disabled, though.
 

 Hm, and if its not 0xff? Is a sub page write valid in that case?
 
 It's also OK if it matches what's already there.  Otherwise, it will
 corrupt.
 
 If a subpage write means that the rest of the page is 0xff, we could
 also memset the whole SRAM 0xff and copy only the subpage data into the
 buffer. We would still need to write the whole page, but only read parts
 of it from main memory.
 
 The generic NAND code already does this -- see the memset() in
 nand_do_write_ops().  In Linux we rely on this in the eLBC driver
 because it worked by accident and disabling subpages now would break UBI
 compatibility.
 
  OTH, when we disable subpage write, and the stack prepares the whole
 page, the page is probably still in the CPU cache anyway, and hence the
 copy operation would be nearly as fast as memset/and partly memcpy
 Probably not worth the whole effort.
 
 Especially since you'd be doing one write rather than four full-page
 partial writes.  Surely the bottleneck here is the NAND chip itself,
 not copying data to the buffer?
 

Of course, if sub-page writes are supported, this would be faster. But
that is not the case (the controller has something called virtual pages
for pages over 2K, but I guess that qualifies not as sub-page write).

Afaik, everything happens sequentially, so every operation is hurting
the overall performance.

  I think it mainly makes sense when the higher layer reads OOB and then
  the main data or visa-verse. I saw such access patterns at least in
  U-Boot.
 
  Wouldn't it make more sense to not read a full page every time OOB is
  read?

 I think when ECC is enabled this is not possible. However, since OOB is
 not ECC checked, we could also turn off ECC and read only the OOB.
 However, I'm also not sure if that is supported by the controller, I
 need to check that. If it both is not possible, I would argue that
 remembering the page in SRAM is worth to effort to speed up page = OOB
 or OOB = page access...
 
 If the controller can't support reading OOB by itself, that would be an
 answer to what is different about this controller...  Though if
 caching is done to deal with that, it should be limited to only dealing
 with consecutive reads.  Don't cache writes.

Actually, I just realized that the driver is not caching writes.

 switch (command) {
 case NAND_CMD_PAGEPROG:
 nfc-page = -1;
+vf610_nfc_transfer_size(nfc-regs, nfc-page_sz);
 vf610_nfc_send_commands(nfc-regs, NAND_CMD_SEQIN,
 command, PROGRAM_PAGE_CMD_CODE);
 vf610_nfc_addr_cycle(mtd, column, page);
 break;

The nfc-page = -1 resets the page cache, so the next read triggers a
full page read.

I will check the performance consequences when disabling cache entirely,
and whether it would be possible to implement a OOB only read.

--
Stefan
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Scott Wood
On Mon, 2015-03-30 at 22:14 +0200, Stefan Agner wrote:
 On 2015-03-30 19:02, Bill Pringlemeir wrote:
  On 24 Mar 2015, ste...@agner.ch wrote:
  
  The driver tries to re-use the page buffer by storing the page
  number of the current page in the buffer. The page is only read
  if the requested page number is not currently in the buffer. When
  a block is erased, the page number is marked as invalid if the
  erased page equals the one currently in the cache. However, since
  a erase block consists of multiple pages, also other page numbers
  could be affected.
 
  The commands to reproduce this issue (on a written page):
  nand dump 0x800
  nand erase 0x0 0x2
  nand dump 0x800
 
  The second nand dump command returns the data from the buffer,
  while in fact the page is erased (0xff).
 
  Avoid the hassle to calculate whether the page is affected or not,
  but set the page buffer unconditionally to invalid instead.
 
  Signed-off-by: Stefan Agner ste...@agner.ch
  ---
  This are two bug fixes which would be nice if they would still
  make it into 2015.04...
 
  drivers/mtd/nand/vf610_nfc.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)
 
  diff --git a/drivers/mtd/nand/vf610_nfc.c
  b/drivers/mtd/nand/vf610_nfc.c index 928d58b..9de971c 100644 ---
  a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@
  -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd,
  unsigned command, break;
 
 case NAND_CMD_ERASE1: - if (nfc-page == page) - nfc-page =
  -1; + nfc-page = -1; vf610_nfc_send_commands(nfc-regs, command,
  NAND_CMD_ERASE2, ERASE_CMD_CODE); vf610_nfc_addr_cycle(mtd, column,
  page);
  
  This change looks sensible.  It is also possible that because sub-pages
  were removed that we could just remove the caching all together.  It is
  possible that a higher layer may intentionally want to program and then
  do a read to verify.
 
 Hm, I now realize that this actually happened somewhat by accident. Back
 when I migrated the driver to U-Boot, I removed the hwctl function since
 it was an empty function. This probably lead to the problem with sub
 page writes, which is why sub-page writes has been removed (by adding
 NAND_NO_SUBPAGE_WRITE).

Subpages can be supported without hwctl, by implementing the appropriate
commands -- if the hardware supports it (e.g. some controllers only want
to do ECC on full pages).  There's a comment in the driver saying that
NFC does not support sub-page reads and writes.

If hwctl was empty it probably means that this controller does not
expose an interface that matches well with hwctl.

 However, in order to avoid a full page getting allocated and memcpy'ed
 when only writing a part of a page, we actually could re-enable that
 feature. But I'm not sure about the semantics of a subpage writes: Does
 the stack makes sure that the rest of the bytes are in the cache (e.g.
 read before write)? From what I understand, a subpage write would only
 copy (via vf610_nfc_write_buf) the data required into the the page
 cache, which then gets written to the device. Who makes sure that the
 rest of the page contains sound data?

If the rest of the page is all 0xff it shouldn't affect the existing
data -- as long as the controller isn't writing some non-0xff ECC bytes
as a result.

It's moot if subpage writes are disabled, though.

  I guess we want to stay the same as the mainline Linux you are
  submitting.  I haven't benchmarked the updates since sub-pages were
  removed to see if this is worth it.  I think it was only ~10-20% in some
  benchmark I was doing with the 'caching'.
 
 I think it mainly makes sense when the higher layer reads OOB and then
 the main data or visa-verse. I saw such access patterns at least in
 U-Boot.

Wouldn't it make more sense to not read a full page every time OOB is
read?

  At least in the small, this is a minimal change that is correct.
  
  Ack-by: Bill Pringlemeir bpringlem...@nbsps.com
 
 Thanks for the Ack. If still possible, it would be nice to see that in
 2015.04... :-)

I'd rather see the caching removed entirely.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Scott Wood
On Mon, 2015-03-30 at 23:26 +0200, Stefan Agner wrote:
 On 2015-03-30 22:48, Scott Wood wrote:
  What is special about this controller, that caching makes sense here but
  not on other controllers?  If it makes sense everywhere, then the upper
  layer is the place to do it.
  
 
 Well, its not a caching which could be handled by upper layers: The NFC
 reads the data into SRAM, where it gets hardware ECC checked. After
 that, the hardware would be capable of copy the data into main memory
 using DMA, however, the driver currently is not using this
 functionality. Instead, we use memcpy, and copy only the data which are
 requested.

The upper layer could choose to read the whole page at once.

  However, in order to avoid a full page getting allocated and memcpy'ed
  when only writing a part of a page, we actually could re-enable that
  feature. But I'm not sure about the semantics of a subpage writes: Does
  the stack makes sure that the rest of the bytes are in the cache (e.g.
  read before write)? From what I understand, a subpage write would only
  copy (via vf610_nfc_write_buf) the data required into the the page
  cache, which then gets written to the device. Who makes sure that the
  rest of the page contains sound data?
  
  If the rest of the page is all 0xff it shouldn't affect the existing
  data -- as long as the controller isn't writing some non-0xff ECC bytes
  as a result.
  
  It's moot if subpage writes are disabled, though.
  
 
 Hm, and if its not 0xff? Is a sub page write valid in that case?

It's also OK if it matches what's already there.  Otherwise, it will
corrupt.

 If a subpage write means that the rest of the page is 0xff, we could
 also memset the whole SRAM 0xff and copy only the subpage data into the
 buffer. We would still need to write the whole page, but only read parts
 of it from main memory.

The generic NAND code already does this -- see the memset() in
nand_do_write_ops().  In Linux we rely on this in the eLBC driver
because it worked by accident and disabling subpages now would break UBI
compatibility.

  OTH, when we disable subpage write, and the stack prepares the whole
 page, the page is probably still in the CPU cache anyway, and hence the
 copy operation would be nearly as fast as memset/and partly memcpy
 Probably not worth the whole effort.

Especially since you'd be doing one write rather than four full-page
partial writes.  Surely the bottleneck here is the NAND chip itself,
not copying data to the buffer?

  I think it mainly makes sense when the higher layer reads OOB and then
  the main data or visa-verse. I saw such access patterns at least in
  U-Boot.
  
  Wouldn't it make more sense to not read a full page every time OOB is
  read?
 
 I think when ECC is enabled this is not possible. However, since OOB is
 not ECC checked, we could also turn off ECC and read only the OOB.
 However, I'm also not sure if that is supported by the controller, I
 need to check that. If it both is not possible, I would argue that
 remembering the page in SRAM is worth to effort to speed up page = OOB
 or OOB = page access...

If the controller can't support reading OOB by itself, that would be an
answer to what is different about this controller...  Though if
caching is done to deal with that, it should be limited to only dealing
with consecutive reads.  Don't cache writes.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-30 Thread Scott Wood
On Tue, 2015-03-31 at 00:24 +0200, Stefan Agner wrote:
 Actually, I just realized that the driver is not caching writes.
 
  switch (command) {
  case NAND_CMD_PAGEPROG:
  nfc-page = -1;
 +vf610_nfc_transfer_size(nfc-regs, nfc-page_sz);
  vf610_nfc_send_commands(nfc-regs, NAND_CMD_SEQIN,
  command, PROGRAM_PAGE_CMD_CODE);
  vf610_nfc_addr_cycle(mtd, column, page);
  break;
 
 The nfc-page = -1 resets the page cache, so the next read triggers a
 full page read.

 I will check the performance consequences when disabling cache entirely,
 and whether it would be possible to implement a OOB only read.

OK.  In the meantime I'll apply this.

-Scott


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


[U-Boot] [PATCH 1/2] mtd: vf610_nfc: mark page as dirty on block erase

2015-03-24 Thread Stefan Agner
The driver tries to re-use the page buffer by storing the page
number of the current page in the buffer. The page is only read
if the requested page number is not currently in the buffer. When
a block is erased, the page number is marked as invalid if the
erased page equals the one currently in the cache. However, since
a erase block consists of multiple pages, also other page numbers
could be affected.

The commands to reproduce this issue (on a written page):
 nand dump 0x800
 nand erase 0x0 0x2
 nand dump 0x800

The second nand dump command returns the data from the buffer,
while in fact the page is erased (0xff).

Avoid the hassle to calculate whether the page is affected or not,
but set the page buffer unconditionally to invalid instead.

Signed-off-by: Stefan Agner ste...@agner.ch
---
This are two bug fixes which would be nice if they would still
make it into 2015.04...

 drivers/mtd/nand/vf610_nfc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c
index 928d58b..9de971c 100644
--- a/drivers/mtd/nand/vf610_nfc.c
+++ b/drivers/mtd/nand/vf610_nfc.c
@@ -369,8 +369,7 @@ static void vf610_nfc_command(struct mtd_info *mtd, 
unsigned command,
break;
 
case NAND_CMD_ERASE1:
-   if (nfc-page == page)
-   nfc-page = -1;
+   nfc-page = -1;
vf610_nfc_send_commands(nfc-regs, command,
NAND_CMD_ERASE2, ERASE_CMD_CODE);
vf610_nfc_addr_cycle(mtd, column, page);
-- 
2.3.3

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot