Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)

2012-03-05 Thread Orjan Friberg

On 03/02/2012 06:17 PM, Grazvydas Ignotas wrote:

IIRC NAND in mainline was broken for a very long time on OMAP3; I think
it was only fixed in 2.6.39.1.


That seems to be the case; the 2.6.39.1 diff contains the OMAP NAND sub 
page write fix (applied locally).



Can anyone else testify to how volatile NAND ECC errors are?

I.e., are they expected to be more persistent?


Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)

2012-03-05 Thread Orjan Friberg

On 03/05/2012 09:56 AM, Matthieu CASTET wrote:

Note that the omap driver is still broken :
http://article.gmane.org/gmane.linux.drivers.mtd/36079/match=

We detected this when stressing a board.

Because of all these bugs in the omap driver, I wonder how many people really use
the mainline version.


Do you know any repo where this is working correctly (linux-omap, or one 
of the vendor trees etc)?



Also, if you use a NAND that needs 4-bit ECC, you need a stronger ECC than Hamming.
You can use the BCH code (
http://article.gmane.org/gmane.linux.drivers.mtd/37864/match=omap )


Yes, I've been looking at the BCH 4-bit code (both generic 
implementations and the OMAP GPMC-enabled one) in u-boot and linux.
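
As a rough guide to why 4-bit BCH needs more OOB space than 1-bit Hamming, the sketch below applies the textbook bound for binary BCH codes: a code over GF(2^m) that corrects t bit errors needs at most m * t parity bits, and the code length 2^m - 1 must cover data plus parity. The function names are made up for illustration, and the byte counts are a size estimate only, not the OOB layout used by any particular driver.

```c
#include <assert.h>

/* Smallest m such that a binary BCH code of length 2^m - 1 can hold
 * data_bytes of data plus m * t parity bits. */
unsigned bch_field_order(unsigned data_bytes, unsigned t)
{
    unsigned m = 2;

    assert(data_bytes > 0 && t > 0);
    while ((1ul << m) - 1 < 8ul * data_bytes + (unsigned long)m * t) {
        m++;
        assert(m < 31);  /* keep the shift well defined */
    }
    return m;
}

/* Upper bound on ECC bytes per sector: m * t parity bits, rounded up. */
unsigned bch_ecc_bytes(unsigned data_bytes, unsigned t)
{
    unsigned m = bch_field_order(data_bytes, t);

    return (m * t + 7) / 8;
}
```

For a 512-byte sector this gives m = 13, so correcting 4 bits costs 7 bytes of ECC per sector and correcting 8 bits costs 13 bytes.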



--
Orjan Friberg
FlatFrog Laboratories AB


Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)

2012-03-02 Thread Orjan Friberg

Hi,

When running the mtd_subpagetest I'm seeing more or less spurious ECC 
corrections.  I.e., one round may show 4 corrections and the next will 
show 7, only some of which are the same as the previous 4.


Are the ECC errors expected to be that volatile and frequent?
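
One way to put a number on "only some of which are the same" is to log the addresses where each test round reported a correction and count how many recur. The sketch below assumes the addresses have already been collected into sorted arrays; the function name and the way the addresses are gathered are hypothetical, not part of mtd_subpagetest.

```c
#include <assert.h>
#include <stddef.h>

/* Count addresses that appear in both rounds.  Both arrays must be sorted
 * in ascending order and contain no duplicates. */
size_t count_persistent(const unsigned long *a, size_t na,
                        const unsigned long *b, size_t nb)
{
    size_t i = 0, j = 0, common = 0;

    assert(a != NULL || na == 0);
    assert(b != NULL || nb == 0);

    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                /* only in round A: transient */
        } else if (a[i] > b[j]) {
            j++;                /* only in round B: transient */
        } else {
            common++;           /* seen in both rounds: persistent */
            i++;
            j++;
        }
    }
    return common;
}
```

A high share of recurring addresses would point at genuinely weak cells, while a low share suggests the corrections come from somewhere else (driver, timing, or ECC handling).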


I've seen various discussions regarding the OMAP sub page support, as 
well as problems with the GPMC prefetch engine.  Disabling both made no 
difference regarding this.  I've also tried two different sets of NAND 
timings (relaxed and optimized), with no difference.


I'm using a Micron NAND that requires 4-bit ECC, but I'm 
running with only 1-bit (software) ECC.  This is on an old kernel, 2.6.32.


Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB


Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)

2012-03-02 Thread Orjan Friberg

On 03/02/2012 05:17 PM, Orjan Friberg wrote:

Hi,

When running the mtd_subpagetest I'm seeing more or less spurious ECC
corrections.  I.e., one round may show 4 corrections and the next will
show 7, only some of which are the same as the previous 4.


FWIW

* I'm seeing the same behaviour (i.e. transient ECC errors) when doing 
nanddump on a partition.
* mtd_oobtest fails with "verify failed" at varying addresses, and when 
reading past the end of the device.



--
Orjan Friberg
FlatFrog Laboratories AB


Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-26 Thread Orjan Friberg

On 01/25/2012 10:02 PM, Orjan Friberg wrote:

That one-liner was boiled down from the following program, which still
oopses instantly:


The C program seems to work fine with CONFIG_PREEMPT_NONE=y.

If that is indeed the problem I guess it's reasonable that it worked
better with PREEMPT_VOLUNTARY than PREEMPT because there are fewer
preemption points.

--
Orjan Friberg
FlatFrog Laboratories AB


Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-26 Thread Orjan Friberg

On 01/25/2012 10:18 PM, Paul Walmsley wrote:

- If your oopses are consistently in the same places, add some debugging
   to that code to determine which line is actually causing the oops.


(CC:d linux-mtd.)

They are semi-consistent I'd say.  The oops trace I posted is by far the 
most common.



   problem to mysteriously disappear.  Doing this analysis should provide a
   good clue as to where to look next.  I personally would be rather
   suspicious of that

ri->data_crc = cpu_to_je32(crc32(0, comprbuf, cdatalen));

   in jffs2_write_inode_range().


That is indeed the place where crc32 is called from.  I'll see if I can 
track the use of comprbuf.



- Try turning on JFFS2 debugging and seeing if you can reproduce it.
   The output might provide a clue as to where the problem would be.


Here are two examples (immediately preceding the oops):

jffs2_reserve_space(): Requested 0x30 bytes
jffs2_reserve_space(): alloc sem got
[JFFS2 DBG] (1189) jffs2_do_reserve_space: minsize=48 , jeb->free=46852 ,summary->size=16586 , sumsize=29

jffs2_do_reserve_space(): Giving 0x75f4 bytes at 0x3d48fc
jffs2_write_dirent(ino #1, name at *0xdea7b93c "file1"->ino #111, name_crc 0x58c597f8)



jffs2_write_begin()
jffs2_read_inode_range: ino #12, range 0x-0x1000
Filling non-frag hole from 0-4096
end write_begin(). pg->flags 9
jffs2_write_end(): ino #12, page at 0x0, range 0-800, flags d
jffs2_write_inode_range(): Ino #12, ofs 0x0, len 0x320
jffs2_reserve_space(): Requested 0xc4 bytes
jffs2_reserve_space(): alloc sem got
[JFFS2 DBG] (1454) jffs2_do_reserve_space: minsize=196 , jeb->free=123148 ,summary->size=1567 , sumsize=18

jffs2_do_reserve_space(): Giving 0x1dab0 bytes at 0xf941ef4
calling deflate with avail_in 788, avail_out 788
deflate returned with avail_in 0, avail_out 428, total_in 788, total_out 360
calling deflate with avail_in 12, avail_out 428
deflate returned with avail_in 0, avail_out 414, total_in 800, total_out 374
zlib compressed 800 bytes into 380


I'll take a look at what jffs2_do_reserve_space is up to.


Thanks.

--
Orjan Friberg
FlatFrog Laboratories AB


Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-26 Thread Orjan Friberg

On 01/26/2012 11:15 AM, Orjan Friberg wrote:

problem to mysteriously disappear.  Doing this analysis should provide a
good clue as to where to look next.  I personally would be rather
suspicious of that

ri->data_crc = cpu_to_je32(crc32(0, comprbuf, cdatalen));

in jffs2_write_inode_range().


That is indeed the place where crc32 is called from.  I'll see if I can
track the use of comprbuf.


Ok, so comprbuf comes from jffs2_compress and becomes NULL for some 
reason (hence the oops).


Initially I had CMODE_FAVOUR_LZO.  With that, things only worked with 
PREEMPT_NONE.  However, when changing to CMODE_PRIORITY or CMODE_NONE 
things do seem to work *with* PREEMPT.


For what it's worth (with PREEMPT on):

CMODE_FAVOUR_LZO with LZO disabled oopses.
CMODE_FAVOUR_LZO with only ZLIB enabled oopses.
CMODE_FAVOUR_LZO with ZLIB/LZO/RTIME/RUBIN disabled does not oops.

Thus, the bug seems to be in the *selection* of compression algorithm 
(when there is at least one algorithm in the list), rather than in the 
specific compression algorithms themselves.
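
To make the distinction concrete, here is a minimal userspace sketch of a "try every compressor, keep the smallest result" loop. It is not the JFFS2 code and not a patch from this thread: the compress_fn signature, the two toy compressors and compress_best() are all invented for illustration. The point it shows is buffer ownership: every attempt gets a private output buffer, a losing buffer is freed straight away, and the previous winner is freed only when a better one replaces it, so the caller never receives NULL or freed memory after a successful attempt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compressor signature: on entry *out_len is the capacity of
 * out; on success (return 0) it is the number of bytes produced. */
typedef int (*compress_fn)(const unsigned char *in, size_t in_len,
                           unsigned char *out, size_t *out_len);

/* Toy compressor 1: plain copy, never shrinks the data. */
int copy_compress(const unsigned char *in, size_t in_len,
                  unsigned char *out, size_t *out_len)
{
    if (*out_len < in_len)
        return -1;
    memcpy(out, in, in_len);
    *out_len = in_len;
    return 0;
}

/* Toy compressor 2: run-length encoding as (count, byte) pairs. */
int rle_compress(const unsigned char *in, size_t in_len,
                 unsigned char *out, size_t *out_len)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        size_t run = 1;

        while (i + run < in_len && in[i + run] == in[i] && run < 255)
            run++;
        if (o + 2 > *out_len)
            return -1;          /* would not fit: report failure */
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    *out_len = o;
    return 0;
}

/* Try each compressor and return the smallest output in a buffer that the
 * caller must free().  Returns NULL only if no compressor succeeded. */
unsigned char *compress_best(const compress_fn *fns, size_t nfns,
                             const unsigned char *in, size_t in_len,
                             size_t *best_len)
{
    unsigned char *best = NULL;
    size_t i;

    assert(fns != NULL && in != NULL && best_len != NULL);

    for (i = 0; i < nfns; i++) {
        size_t len = in_len;
        unsigned char *tmp = malloc(in_len ? in_len : 1);

        if (!tmp)
            continue;
        if (fns[i](in, in_len, tmp, &len) == 0 &&
            (best == NULL || len < *best_len)) {
            free(best);         /* drop the previous winner only */
            best = tmp;
            *best_len = len;
        } else {
            free(tmp);          /* loser: buffer was private to this try */
        }
    }
    return best;
}
```

With an 800-byte block of zeros (the same payload as the dd one-liner), the copy compressor yields 800 bytes and the run-length one yields 8, so the second result replaces the first.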



--
Orjan Friberg
FlatFrog Laboratories AB


Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-26 Thread Orjan Friberg

Paul,

Your patch works fine in that it doesn't oops, and I'm not seeing any 
BUGs from CONFIG_DEBUG_SPINLOCK.  I haven't verified *anything else* 
(performance etc).


We've had some discussions on the linux-mtd list during the day, 
starting at 
http://lists.infradead.org/pipermail/linux-mtd/2012-January/039442.html 
if you're interested (though that discussion didn't result in a patch).


Thanks,
Orjan


--
Orjan Friberg
FlatFrog Laboratories AB
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-26 Thread Orjan Friberg

On 01/26/2012 05:57 PM, Paul Walmsley wrote:


You just throw away best_buf here, don't you?


You're right.  It's even worse than that.  best_buf will contain the data
from the last compressor used.  And it will be prematurely freed.  Here's
a fixed version.


I've tested this version for a while now with the same result as before.

No oopses, no spinlock violations.  I copied a 2MB file from the SD/MMC 
partition to the two JFFS2 partitions and md5summ'ed it a bunch of 
times.  After that I unmounted and remounted both partitions.


I do see a steady memory usage increase when doing continuous testing, 
but whether that's normal I don't know.  I see at least some of it being 
reclaimed when unmounting the JFFS2 partitions (grep jffs2 /proc/slabinfo).


--
Orjan Friberg
FlatFrog Laboratories AB


CONFIG_PREEMPT and JFFS2 oops

2012-01-25 Thread Orjan Friberg

Hi,

With CONFIG_PREEMPT=y and hammering away on two different JFFS2 
partitions on a NAND flash I get an oops within ~10 seconds.  This is on 
a BeagleBoard xM (rev A2, with NAND).


I've boiled it down to whether CONFIG_PREEMPT (bug happens) or
CONFIG_PREEMPT_VOLUNTARY (bug doesn't happen) is selected.  Of course,
changing that affects other things too, like inline spinlocking.  Turning 
on CONFIG_DEBUG_SPINLOCK reveals nothing.



By changing this option, I've made the bug go away in a 2.6.32 and
2.6.37 setup where it previously happened, and I've made it appear in a
2.6.39 setup where it previously didn't happen.


Pointers on what to look at next are appreciated.  (I've posted this on 
the mtd-utils mailing list too.)  More details below.



Thanks,
Orjan


The setup is simply two JFFS2-formatted partitions, and launching a

  while :; do dd if=/dev/zero of=file bs=800 count=1; done

on each of them.  Sometimes the oops trace originates from the garbage 
collector, sometimes the result is a JFFS2 decompress error.



--
Orjan Friberg
FlatFrog Laboratories AB
[   81.200805] Unable to handle kernel NULL pointer dereference at virtual 
address 
[   81.217529] pgd = ce13c000
[   81.220855] [] *pgd=8e172031, *pte=, *ppte=
[   81.236480] Internal error: Oops: 17 [#1] PREEMPT
[   81.241210] last sysfs file: /sys/kernel/uevent_seqnum
[   81.246368] Modules linked in: ftdi_sio usbserial
[   81.251129] CPU: 0Not tainted  (2.6.32 #6)
[   81.255584] PC is at crc32_le+0x6c/0xf4
[   81.259460] LR is at jffs2_write_inode_range+0x2a0/0x420
[   81.264801] pc : [c0211f28]lr : [c01ae930]psr: 2013
[   81.264801] sp : ce24bcd0  ip : 0001  fp : ce11f840
[   81.276336] r10: 000c  r9 : ce5231d0  r8 : fffc
[   81.281585] r7 : 0002  r6 :   r5 : c03fcf9c  r4 : 
[   81.288146] r3 :   r2 : 0008  r1 :   r0 : 
[   81.294677] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   81.301849] Control: 10c5387d  Table: 8e13c019  DAC: 0015
[   81.307617] Process dd (pid: 5270, stack limit = 0xce24a2f0)
[   81.313323] Stack: (0xce24bcd0 to 0xce24c000)
[   81.317687] bcc0:  0002 
0003 
[   81.325897] bce0:  c01ae930 ce24bd1c ce24bd18  0008 
 
[   81.334136] bd00:  0002 cdca7000 ce1a8800   
0008 0320
[   81.342346] bd20: 0001326c  0320  ce11f840 ce523208 
 c07754e0
[   81.350555] bd40: 0320  ce1a8800 c01a8ac4  0320 
ce24bd74 
[   81.358764] bd60:  0320   0320  
0320 0320
[   81.367004] bd80:    0320   
ce5232b0 c0097d1c
[   81.375213] bda0: 0320 0320 c07754e0 ce523208 ce24a000 cebf4140 
ce5232b0 1000
[   81.383422] bdc0:  c03efe38 ce24bf40 0001  0320 
ce523208 c07754e0
[   81.391632] bde0: 0320 0320  0320 ce523208  
 
[   81.399871] be00:  c009846c   ce24bf00 0320 
 
[   81.408081] be20: 0002 ce24bf00 ce24bf40 ce24beb0 cebf4140 ce5232b0 
0320 0001
[   81.416290] be40: ce24a000 ce523278 000ad008 c03dd658  0320 
 ce523278
[   81.424530] be60: ce24bf40 ce24beb0 0001  cebf4140  
000ad008 c009851c
[   81.432739] be80: ce24beb0 ce24bf40   ce24beb0 cebf4140 
ce24bf80 ce24a000
[   81.440948] bea0: 000aad28 c00bf584   00020242 ce1ae000 
 0001
[   81.449157] bec0:  cebf4140     
ce12d6c0 00020241
[   81.457397] bee0:   0200 ce12d6c0 c0077028 ce24bef4 
ce24bef4 0004
[   81.465606] bf00:   000aad28 0300   
0320 00100073
[   81.473815] bf20: 000ad000 ce24a000 000ce000  0002 ceb450e0 
ce4b0618 0001
[   81.482025] bf40: 000ad008 0320 cebf4140 000ad008 ce24bf80 0320 
0320 c00c01c8
[   81.490264] bf60: cebf4140 000ad008   cebf4140 0320 
000ad008 c00c036c
[   81.498474] bf80:   0320  0320 0001 
000ad008 0004
[   81.506683] bfa0: c00390c4 c0038f40 0320 0001 0001 000ad008 
0320 000acd34
[   81.514923] bfc0: 0320 0001 000ad008 0004 0320 000ad008 
000aad28 000ad008
[   81.523132] bfe0: 4001e3e0 bece4b60 00010e34 40188abc 6010 0001 
 
[   81.531372] [c0211f28] (crc32_le+0x6c/0xf4) from [c01ae930] (jffs2_write_inode_range+0x2a0/0x420)
[   81.540618] [c01ae930] (jffs2_write_inode_range+0x2a0/0x420) from [c01a8ac4] (jffs2_write_end+0x190/0x2d4)
[   81.550689] [c01a8ac4] (jffs2_write_end+0x190/0x2d4) from [c0097d1c] (generic_file_buffered_write+0x180/0x264)
[   81.561096] [c0097d1c

Re: CONFIG_PREEMPT and JFFS2 oops

2012-01-25 Thread Orjan Friberg

On 01/25/2012 09:12 PM, Orjan Friberg wrote:

I've boiled it down to whether CONFIG_PREEMPT (bug happens) or
CONFIG_PREEMPT_VOLUNTARY (bug doesn't happen) is selected.


No, I haven't.  The problem disappeared only for

   while :; do dd if=/dev/zero of=file bs=800 count=1; done

That one-liner was boiled down from the following program, which still
oopses instantly:

   #include <stdio.h>
   #include <unistd.h>
   #include <sys/types.h>
   #include <sys/stat.h>
   #include <fcntl.h>

   int main(void)
   {
      int fd;
      struct stat st;
      char buf[800];   /* contents do not matter, only the write itself */

      /* Stress loop; return values are deliberately not checked. */
      do {
         unlink("file2");
         fd = open("file1", O_RDWR|O_CREAT|O_TRUNC, 0666);
         stat("file1", &st);
         lseek(fd, 0, SEEK_SET);
         write(fd, buf, 800);
         close(fd);
         rename("file1", "file2");
      } while (1);

      return 0;
   }


(Apologies for spamming.)


--
Orjan Friberg
FlatFrog Laboratories AB


USB gadget unreliable on software reboot (BeagleBoard xM)

2011-10-04 Thread Orjan Friberg

Hi,

On my BeagleBoard xM, configuring the MUSB controller in Linux to 
peripheral mode (i.e. not OTG mode) and using a built-in gadget driver, 
the gadget device sometimes does not appear after a software reboot.


I've seen this with both 2.6.32 and 2.6.39 (Angstrom 2008.1 and 2010.x 
distros, respectively).


Our own board exhibits the same behaviour.  However: configuring the 
MUSB controller in u-boot as a device and only booting as far as u-boot 
before a software reset, the device always appears.  To me this suggests 
a MUSB driver issue in Linux (as opposed to, say, PHY initialization).



I checked with a USB analyzer what happens on the bus: when it doesn't 
show up there is a reset on the bus when we reboot, but it doesn't 
re-enter full speed mode.  No SOFs are sent either.


I did a rudimentary check of the OTG registers in the TPS chip (over 
i2c) but saw nothing out of the ordinary.


I set musb_debug = 5 in musb_core.c, but no errors are reported.

I haven't looked at the MUSB controller registers yet; that's next.


Any ideas?


Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB


copy_to_user speed from dma_alloc_coherent vs. kmalloc buffer

2011-04-20 Thread Orjan Friberg

Hi,

I have a driver where I do memory-to-memory DMA between GPMC and SDRAM.
Adding a read function, I found that copy_to_user from a 
dma_alloc_coherent buffer is significantly slower than from a kmalloc'd one.


Looking at arch/arm/include/asm/pgtable.h I suspect this difference in 
speed is due to the fact that the dma_alloc_coherent buffer is unbuffered.


What are my options (besides using mmap)?

* Reserve a portion of memory at boot time to be used as the DMA 
destination buffer, use ioremap_cached + manual cache flush as needed?
* Turn on buffering for the DMA destination buffer for the duration of 
the copy_to_user call, then turn it off again (and flush it from the cache)?

* Something else entirely?


This is on a 3730, on Linux 2.6.32.

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB


Re: copy_to_user speed from dma_alloc_coherent vs. kmalloc buffer

2011-04-20 Thread Orjan Friberg

On 2011-04-20 17:12, Orjan Friberg wrote:

What are my options (besides using mmap)?


It looks like kmalloc + dma_map_single for the DMA destination buffer 
and then dma_sync_single_for_{cpu,device} around the call to 
copy_to_user pretty much does the trick.  At least the %sys load 
measured with mpstat goes from 13% to 2%.
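
A minimal sketch of that approach, as a kernel-code fragment rather than a complete driver. The struct mydev type and the function names are hypothetical, and the code assumes that the DMA transfer has completed before read() is called and that no transfer is started while the CPU owns the buffer; that synchronisation is left out.

```c
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/uaccess.h>

/* Hypothetical per-device state; the names are illustrative only. */
struct mydev {
	struct device *dev;	/* device on whose behalf the DMA runs */
	void *buf;		/* kmalloc'd, cacheable destination buffer */
	dma_addr_t dma;		/* bus address from dma_map_single() */
	size_t len;
};

static int mydev_alloc(struct mydev *md, size_t len)
{
	md->buf = kmalloc(len, GFP_KERNEL);
	if (!md->buf)
		return -ENOMEM;

	/* Streaming mapping: the buffer stays cacheable for the CPU. */
	md->dma = dma_map_single(md->dev, md->buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(md->dev, md->dma)) {
		kfree(md->buf);
		return -ENOMEM;
	}
	md->len = len;
	return 0;
}

static ssize_t mydev_read(struct file *file, char __user *ubuf,
			  size_t count, loff_t *ppos)
{
	struct mydev *md = file->private_data;
	ssize_t ret;

	if (count > md->len)
		count = md->len;

	/* Hand the buffer to the CPU so it does not read stale cache lines. */
	dma_sync_single_for_cpu(md->dev, md->dma, md->len, DMA_FROM_DEVICE);
	ret = copy_to_user(ubuf, md->buf, count) ? -EFAULT : (ssize_t)count;
	/* Hand it back to the device before the next transfer. */
	dma_sync_single_for_device(md->dev, md->dma, md->len, DMA_FROM_DEVICE);

	return ret;
}
```

The copy then runs out of ordinary cached memory, which would be consistent with the drop in %sys load reported above.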


--
Orjan Friberg
FlatFrog Laboratories AB


OMAP 3730 200 MHz SDRAM config

2011-03-07 Thread Orjan Friberg

Hi,

I'm looking at configuring an OMAP 3730 board for 200 MHz SDRAM.

I've been looking at the kernel code (arch/arm/mach-omap2) for the last
couple of days to try and figure out what I need to do.  Our board is 
based on the BeagleBoard, so I tried copying the 200 MHz Hynix 
SDRAM entry for the BeagleBoard-xM, but that didn't help: the kernel still 
(re)programs the SDRC clock to 166 MHz.


* Does the kernel at all use or depend on the boot loader's SDRAM
config?  (I'm using u-boot with a prepended configuration header.)

* Does the SDRAM setup/clocking depend on the MPU rate at all?  I.e. do
I need to boot Linux at 1 GHz to be able to set a 200 MHz SDRC clock?

The clock config is a bit convoluted, so I'd appreciate any help.


Thanks,
Orjan



Appendix:

I'm using a program (user-mode app) called 'bandwidth' (which has an ARM
port):
http://home.comcast.net/~fbui/bandwidth.html for measurements.

With big (several MB) sequential writes I get ~1170 MB/s.  The
theoretical max for a 166 MHz clock is 166 * 2 * 4 bytes = 1328 MB/s, so
we're at about 88%.  We're not the only process accessing memory, and maybe
there's some loss due to SDRAM refresh etc.
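
The arithmetic above, written out as a small C sketch so the 200 MHz target can be compared on the same basis. It assumes, as in the calculation above, two transfers per clock cycle on a 4-byte-wide bus; the function names are made up for illustration.

```c
#include <assert.h>

/* Peak transfer rate in MB/s for a double-data-rate interface:
 * two transfers per clock cycle, bus_bytes bytes per transfer. */
unsigned peak_mb_per_s(unsigned clock_mhz, unsigned bus_bytes)
{
    return clock_mhz * 2u * bus_bytes;
}

/* Measured rate as a percentage of the peak, rounded down. */
unsigned efficiency_percent(unsigned measured_mb_per_s, unsigned peak)
{
    assert(peak != 0);
    return measured_mb_per_s * 100u / peak;
}
```

With these, peak_mb_per_s(166, 4) is 1328 and peak_mb_per_s(200, 4) is 1600, and efficiency_percent(1170, 1328) is 88.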

--
Orjan Friberg
FlatFrog Laboratories AB


Re: OMAP 3730 200 MHz SDRAM config

2011-03-07 Thread Orjan Friberg

On 2011-03-07 16:19, Elvis Dowson wrote:

You probably need to update x-loader. Try using the BeagleBoard x-loader project 
located at gitorious (v1.44) or the TI Arago one (1.48, but not quite the 
latest in terms of support for BeagleBoard xM parts).

Look at board/omap3530beagle/omap3530beagle.c for the memory part 
definitions. For the xM, the Numonyx part is at 165 MHz, and the Micron part is 
at 200 MHz.


I'm using u-boot with a configuration header, and there I have set the 
new CTRLA, CTRLB and RFR values (and I did compare the values with the 
Micron data sheet; apart from the TCKE value they are all identical).


But are you saying that the values set by the boot loader are preserved 
by the kernel?  (In that case I wonder what the sdram-micron header file 
is for :)



Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB