Re: ohci1394 broke 2.6.19 -> 2.6.20-rc1
On 2/5/07, Stefan Richter <[EMAIL PROTECTED]> wrote: It's my oversight, see patch. Yes, this fixes things. Thanks! -- Robert Crocombe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
ohci1394 broke 2.6.19 -> 2.6.20-rc1
Prior to testing a patch for bugzilla bug 7569 (hosts lost on bus reset), I wanted to reproduce the behavior. I can under the noted 2.6.16-blah kernels, but moving to anything more recent than 2.6.19 means ohci1394 is non-functional (no 1394 hosts are detected) and the module cannot be removed. I have narrowed it down to 2.6.19 works, 2.6.20-rc1 doesn't. Lots of detail at: http://bugzilla.kernel.org/show_bug.cgi?id=7942

-- Robert Crocombe
Re: In-tree version of new FireWire drivers available
On 1/25/07, Pieter Palmers <[EMAIL PROTECTED]> wrote:

I'd like to make one note here: We should have a way to use smaller DMA buffers than one page size. If I remember correctly, the page size on my system is 4096 bytes, being 1024 quadlets. If we assume a 4 channel audio stream, this corresponds to 256 audio samples. This means that the controller generates an interrupt every 256 samples, so we can achieve a latency of 512 samples at best. This is unacceptable in a pro-audio environment. The current stack exhibits this problem, and I solve it by recalculating the max packet size, based upon the stream composition (i.e. expected packet size) and the requested audio buffer size, such that the interrupts are generated at a high enough frequency. I'm not a kernel hacker, but when looking through the code I had the impression that smaller DMA buffers were possible (aren't smaller buffers used in packet-per-buffer mode?).

I am using isochronous receive in RAW1394_DMA_PACKET_PER_BUFFER mode because I am closing a simulation loop around the data that is received/transmitted. Just for giggles I cranked up a test isochronous stream from a bus analyzer at 1kB per packet at 8kHz at the S400 rate (i.e., one packet on each cycle start: 8MBps), set the machine up to listen, and was able to maintain 8kHz interrupts at ~12% CPU utilization on a 2.8GHz Opteron.

1744719 interrupts in 218.112 seconds is 7999.193 ints/sec

I wasn't doing anything with the data for this test, but I have had the aforementioned sim running steady at a somewhat lower rate. This test ran under 2.6.20-rc5-rt10, but the more "productiony" system is on 2.6.16-rt29. So hopefully you can get markedly lower latencies. Myself, I'm tickled pink by the performance that can be achieved.
-- Robert Crocombe
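The buffer-size arithmetic in the message above can be checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and it assumes one quadlet per sample per channel with no packet headers, which is the simplification the quoted text uses.

```c
#include <assert.h>

/* Audio frames (one sample per channel) that fit in one DMA buffer.
 * Assumption: each sample occupies exactly one 4-byte quadlet. */
static unsigned frames_per_buffer(unsigned buffer_bytes, unsigned channels)
{
    assert(channels > 0);
    unsigned quadlets = buffer_bytes / 4;
    return quadlets / channels;
}

/* Average interrupt rate from an interrupt count and an elapsed time. */
static double interrupts_per_second(unsigned long count, double seconds)
{
    assert(seconds > 0.0);
    return (double)count / seconds;
}
```

With a 4096-byte page and 4 channels, frames_per_buffer() gives the 256 samples mentioned above; feeding the measured 1744719 interrupts and 218.112 seconds into interrupts_per_second() gives a rate just under 8000 per second.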
Re: 2.6.19.1-rt15: BUG in __tasklet_action at kernel/softirq.c:568
On 12/19/06, Ingo Molnar <[EMAIL PROTECTED]> wrote:

yeah. This is something that triggers very rarely on certain boxes. Not fixed yet, and it's been around for some time.

Is there anything you would like me to do to help diagnose this?

-- Robert Crocombe [EMAIL PROTECTED]
2.6.19.1-rt15: BUG in __tasklet_action at kernel/softirq.c:568
Almost exactly 24 hours after booting 2.6.19.1-rt15, I encountered the following:

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
 [] __WARN_ON+0x5c/0x74
 [] __tasklet_action+0xae/0xf2
 [] ksoftirqd+0xfc/0x198
 [] ksoftirqd+0x0/0x198
 [] kthread+0xd1/0x101
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x101
 [] child_rip+0x0/0x12

softirq-tasklet/36[CPU#2]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
 [] __WARN_ON+0x5c/0x74
 [] __tasklet_action+0xae/0xf2
 [] ksoftirqd+0xfc/0x198
 [] ksoftirqd+0x0/0x198
 [] kthread+0xd1/0x101
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x101
 [] child_rip+0x0/0x12

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
 [] __WARN_ON+0x5c/0x74
 [] __tasklet_action+0xae/0xf2
 [] ksoftirqd+0xfc/0x198
 [] ksoftirqd+0x0/0x198
 [] kthread+0xd1/0x101
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x101
 [] child_rip+0x0/0x12

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
 [] __WARN_ON+0x5c/0x74
 [] __tasklet_action+0xae/0xf2
 [] ksoftirqd+0xfc/0x198
 [] ksoftirqd+0x0/0x198
 [] kthread+0xd1/0x101
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x101
 [] child_rip+0x0/0x12

I had set the machine to do 1,000 kernel compiles the day before, but it might have been finished by then (the BUG triggered on a Saturday). I did this because it was kernel compiles that previously triggered a hard lockup on -rt kernels. The machine seems to still be usable, and the compiles all completed.

The referenced line is:

	/*
	 * After this point on the tasklet might be rescheduled
	 * on another CPU, but it can only be added to another
	 * CPU's tasklet list if we unlock the tasklet (which we
	 * dont do yet).
	 */
	if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
		WARN_ON(1);

This is a quad Opteron. Config attached.

-- Robert Crocombe

config_2.6.19.1-rt15 Description: Binary data
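For readers unfamiliar with the primitive in the referenced line, here is a minimal userspace sketch of test-and-clear semantics using C11 atomics. It is not the kernel's implementation; STATE_SCHED and test_and_clear() are hypothetical stand-ins for TASKLET_STATE_SCHED and test_and_clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { STATE_SCHED = 0 };  /* hypothetical bit number for illustration */

/* Atomically clear bit `nr` in *word and report whether it was set before. */
static bool test_and_clear(atomic_ulong *word, unsigned nr)
{
    assert(nr < 8 * sizeof(unsigned long));
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_and(word, ~mask);
    return (old & mask) != 0;
}
```

The first call on a word with the bit set returns true and clears it; a second call returns false. The warning in the report corresponds to that second case: the tasklet was about to run but its "scheduled" bit was already clear.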
Re: realtime-preempt and arm
[EMAIL PROTECTED]:~$ uname -r
2.6.19.1-rt15_00

And I'm totally thrilled since this is the first -rt kernel that I've tried and been able to boot since .16-rt29. Yay!

[EMAIL PROTECTED]:~$ zcat /proc/config.gz | egrep "HZ.*=y"
CONFIG_HZ_1000=y

100 revs; min: 5008 max: 5034 avg: 5015
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5014 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5019 avg: 5009
100 revs; min: 5008 max: 5013 avg: 5009

quad Opteron running x86_64 Fedora Core 5.

-- Robert Crocombe [EMAIL PROTECTED]
Re: isochronous receives?
On 12/13/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

How about leaving ohci1394 as it is but document tag_mask better in libraw1394's inline doxygen(?) comments, and maybe add an enum or macros to be used as values of raw1394_iso_recv_start's tag_mask argument?

/* can be ORed together */
#define RAW1394_IR_MATCH_TAG_0 1
#define RAW1394_IR_MATCH_TAG_1 2
#define RAW1394_IR_MATCH_TAG_2 4
#define RAW1394_IR_MATCH_TAG_3 8
#define RAW1394_IR_MATCH_ALL_TAGS -1

Yeah, that's definitely much better. I guess this would go in libraw1394's raw1394.h? Similar to:

--- raw1394.h 2006-11-29 11:54:56.0 -0700
+++ raw1394_modified.h 2006-12-14 11:20:57.0 -0700
@@ -40,6 +40,14 @@
 #define RAW1394_RCODE_TYPE_ERROR 0x6
 #define RAW1394_RCODE_ADDRESS_ERROR 0x7
 
+/* can be ORed together */
+#define RAW1394_IR_MATCH_TAG_0 0x1
+#define RAW1394_IR_MATCH_TAG_1 0x2
+#define RAW1394_IR_MATCH_TAG_2 0x4
+#define RAW1394_IR_MATCH_TAG_3 0x8
+#define RAW1394_IR_MATCH_ALL_TAGS -1
+#define RAW1394_IR_MATCH_TAG(tag) (1 << (tag))
+
 typedef u_int8_t byte_t;
 typedef u_int32_t quadlet_t;
 typedef u_int64_t octlet_t;
@@ -273,7 +281,9 @@
  * @handle: libraw1394 handle
  * @start_on_cycle: isochronous cycle number on which to start
  * (-1 if you don't care)
- * @tag_mask: mask of tag fields to match (-1 to receive all packets)
+ * @tag_mask: mask of tag fields to match. Use the RAW1394_IR_MATCH_*
+ * values for this rather than the literal tag bits: the values are not
+ * equivalent.
  * @sync: not used, reserved for future implementation
  *
  * Returns: 0 on success or -1 on failure (sets errno)

??

-- Robert Crocombe [EMAIL PROTECTED]
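A short sketch of how the proposed macros would be combined and what "the values are not equivalent" means in practice. The macros are copied from the proposal in this thread (with the all-tags value parenthesized); whether they exist in any given libraw1394 release is not assumed here, and tag_passes_mask() is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as proposed in the thread: one mask bit per 2-bit tag value. */
#define RAW1394_IR_MATCH_TAG_0    0x1
#define RAW1394_IR_MATCH_TAG_1    0x2
#define RAW1394_IR_MATCH_TAG_2    0x4
#define RAW1394_IR_MATCH_TAG_3    0x8
#define RAW1394_IR_MATCH_ALL_TAGS (-1)
#define RAW1394_IR_MATCH_TAG(tag) (1 << (tag))

/* Hypothetical helper: would a packet carrying `tag` (0..3) pass `tag_mask`? */
static bool tag_passes_mask(int tag_mask, unsigned tag)
{
    assert(tag <= 3);
    return (tag_mask & RAW1394_IR_MATCH_TAG(tag)) != 0;
}
```

For example, RAW1394_IR_MATCH_TAG_0 | RAW1394_IR_MATCH_TAG_3 accepts tags 0 and 3 only, while passing the literal tag value 3 as the mask accepts tags 0 and 1 and rejects tag 3.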
Re: isochronous receives?
On 11/29/06, Keith Curtis <[EMAIL PROTECTED]> wrote:

I never resolved the problem. I turned on the excessive debugging output, but it didn't print out info about receiving packets or interrupts. My test app claimed there were no packets received although the bus analyzer showed lots of packets going by.

Well, I figured it out, finally. Thankfully (in a way...), it was my code: I was setting the tag to -1 in a certain spot (which indicates that you want to see all packets, regardless of their tag), but unhelpfully changing it to 0 before calling raw1394_iso_recv_start...

...dangit, though. Looking at the data stream, the tag *is* zero. Stefan, isn't the line:

	/* match on specified tags */
	contextMatch = tag_mask << 28;

in ohci_iso_recv_start() wrong? The register looks to work like this. The tag field is two bits.

if you want to match on 11b, then set bit tag3 (bit 31)
if you want to match on 10b, then set bit tag2 (bit 30)
if you want to match on 01b, then set bit tag1 (bit 29)
if you want to match on 00b, then set bit tag0 (bit 28)

Which makes the shift obviously wrong. Passing in '3' to match on tag 11b will have you instead set bits 29 and 28, and you will match on 01b and 00b. Passing in '0' will completely bone you: no bits will be turned on. Passing in '-1' to match all bits does work, though. You'd have to know to pass in 0x8 to match for tag 11b, which is a skosh counterintuitive and probably not what was intended.

Here's my crap patch. It appears to Work For Me(tm).
--- ohci1394.c 2006-12-04 16:52:10.916044780 -0700
+++ modified_ohci1394.c 2006-12-13 07:22:07.613917511 -0700
@@ -1491,7 +1491,18 @@
 	reg_write(recv->ohci, recv->ContextControlSet, command);
 
 	/* match on specified tags */
-	contextMatch = tag_mask << 28;
+switch (tag_mask)
+{
+	case -1: contextMatch = tag_mask << 28; break;
+	case 0: contextMatch = (1 << 28); break;
+	case 1: contextMatch = (1 << 29); break;
+	case 2: contextMatch = (1 << 30); break;
+	case 3: contextMatch = (1 << 31); break;
+	default:
+		DBGMSG("Invalid tag_mask %0x, matching all tags",tag_mask);
+		contextMatch = tag_mask << 28;
+		break;
+	}
 
 	if (iso->channel == -1) {
 		/* enable multichannel reception */

So nevermind. I'm totally vindicated and my code is, as always, flawless. Cough.

-- Robert Crocombe [EMAIL PROTECTED]
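The register-level effect described in this message can be reproduced in isolation. The sketch below is not driver code: context_match_bits() and mask_for_tag() are hypothetical names, and the first one only models what the tag_mask << 28 line does to bits 31..28 of a 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the driver line: the low four bits of the mask land in bits 31..28. */
static uint32_t context_match_bits(int tag_mask)
{
    return ((uint32_t)tag_mask & 0xFu) << 28;
}

/* One-hot mask selecting a single tag value (0..3). */
static int mask_for_tag(unsigned tag)
{
    assert(tag <= 3);
    return 1 << tag;
}
```

Passing the literal tag 3 yields 0x30000000 (tag0 and tag1 bits), passing 0 yields no bits at all, and passing mask_for_tag(3) yields 0x80000000, the tag3 bit that a caller wanting tag 11b actually needs.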
isochronous receives?
Keith, et al.,

I am having problems with isochronous receives, and remembered just as I was getting ready to dig into the source that there was a message about this stuff. Lo and behold your message to linux1394-user from September 7:

I'm trying to receive isochronous streams (using libraw1394 1.2.0), and I've noticed that if data is transmitted on channel 63, then my app tends to work fine. If the stream is on a different channel, then I don't see any isochronous packets at all. I'm using 2.4.29, I've also tried 2.6.15 with similar results, can't seem to receive channels < 63.

Did you ultimately have any success getting this going? Funnily enough, when I tested isochronous stuff in July, I just did iso transmit since I figured receives *must* be working since everyone has camcorders and whatnot. Currently my iso xmit stuff does appear to be working, but iso receives are not. I have a Firespy and no reason not to trust it, so I can see the junk I'm spewing out. I've tried transmitting on channels 4 and 63 (per your advice), but neither works for me.

I suppose it could be my stuff... nah.

-- Robert Crocombe [EMAIL PROTECTED]
Re: ieee1394: host adapter disappears on 1394 bus reset
On 11/27/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

Posted writes are still enabled. phys_dma=0 disables only the physical response unit. You have to change the source if you want to disable posted writes. See the top of ohci_initialize.

Should this be a module load parameter too?

Er. I misspoke. What I need is for write requests directed to address 0 to be directed to the asynchronous unit so that I can treat them as regular asynchronous write requests. As the OHCI 1.1 spec says:

"Physical requests that are rejected by the PhysicalRequestFilter shall be sent to the AR Request DMA context if the AR Request DMA context is enabled". (5.14.2, page 58)

That does appear to be happening: I have an ARM mapping set to begin at 0 and extend some ways along, and I do receive write requests. At first I was simply changing the lines:

	reg_write(ohci,OHCI1394_PhyReqFilterHiSet, 0x);
	reg_write(ohci,OHCI1394_PhyReqFilterLoSet, 0x);

to be 0x instead, but then I paid more attention to the source and saw the phys_dma parameter, which does the same. Well, *did*, in 2.6.16. I see that 2.6.18 doesn't write 0 if !phys_dma, it just leaves the values alone, but I guess that's okay since they are set to 0 on reset. Same difference.

So that's okay. Uhm, mostly. You should really see the horrors I have created in order to be able to have 5 hosts map the same address range (the custom protocol we're using doesn't use the destination address at all, so it's 0 for everybody).

So long ways round, I think the phys_dma parameter is the proper thing for me. And I will try and do some actual thinking about what is happening. I was hoping to offload that work to you and simply perform mechanical changes to the source! Rats!

-- Robert Crocombe [EMAIL PROTECTED]
Re: ieee1394: host adapter disappears on 1394 bus reset
Robert Crocombe wrote:

this is in 2.6.16-rt29 which has proved to be the easiest to provoke. I actually couldn't get 2.6.18 to break earlier this morning (few hundred resets).

Okay, I got the problem to occur again with 2.6.18. I will attach my config in case you wish to scrutinize for any boneheadedness on my part. I provoked the problem both with and without the additional read of IntMaskSet. Amazingly, I lost host1 on the bus reset that occurred after this sequence:

rmmod ohci1394
rmmod ieee1394
make
make modules_install
modprobe ohci1394

which followed my adding the extra register read line. Here's the entirety of the host1 stuff (I did a s/.*host[^1].*//g in vim). I snipped some of the self ID chatter.

Nov 27 13:06:35 spanky kernel: ieee1394: nodemgr and IRM functionality disabled
Nov 27 13:06:35 spanky kernel: ohci1394: fw-host1: Remapped memory spaces reg 0xc2058000
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Soft reset finished
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Iso contexts reg: 00a8 implemented: 000f
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Iso contexts reg: 0098 implemented: 00ff
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Receive DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Receive DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Transmit DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Transmit DMA ctx=1 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: physUpperBoundOffset=
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: OHCI-1394 1.1 (PCI): IRQ=[98] MMIO=[f9ffe000-f9ffe7ff] Max Packet=[4096] IR/IT contexts=[4/8]
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: IntEvent: 00020010
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: irq_handler: Bus reset requested
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Cancel request received
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Got RQPkt interrupt status=0x8409
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Single packet rcv'd
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: IntEvent: 0001
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: SelfID interrupt received (phyid 1, not root)
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: SelfID packet 0x807fc494 received
Nov 27 13:06:38 spanky kernel: ohci1394: fw-host1: SelfID packet 0x817fc494 received
Nov 27 13:06:38 spanky kernel: ohci1394: fw-host1: SelfID for this node is 0x817fc494
Nov 27 13:06:39 spanky kernel: ohci1394: fw-host1: SelfID packet BLAH
...15 more SelfID...
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: SelfID complete
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: PhyReqFilter=
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: IntEventClear IntEventSet 04508000 IntMaskSet838301f3
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: IntEvent: 00020010
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: irq_handler: Bus reset requested
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Cancel request received
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Got RQPkt interrupt status=0x8409
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Single packet rcv'd
Nov 27 13:06:41 spanky kernel: ohci1394: fw-host1: IntEvent: 0001
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID interrupt received (phyid 1, not root)
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet 0x807fc494 received
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet 0x817fc496 received
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID for this node is 0x817fc496
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet BLAH
...15 more SelfID...
Nov 27 13:06:43 spanky kernel: ohci1394: fw-host1: SelfID complete
Nov 27 13:06:43 spanky kernel: ohci1394: fw-host1: PhyReqFilter=
Nov 27 13:06:44 spanky kernel: ohci1394: fw-host1: IntEventClear IntEventSet 6ffdc33f IntMaskSet

with the bad IntMaskSet again.
I don't know if the host loss when I didn't have the additional read is meaningful, but there it is simply:

Nov 27 13:04:39 spanky kernel: ohci1394: fw-host2: SelfID packet 0x823fc4f8 rf8c43f8c
. . .
Nov 27 13:06:30 spanky kernel: ohci1394: fw-host2: Soft reset finished

with 2 minutes and ~30 bus resets in between.

Oh, poop. I didn't mention that I have:

options ieee1394 disable_nodemgr=1
options ohci1394 phys_dma=0

in my /etc/modprobe.conf. The Linux adapters are functioning as simulated peripherals to a piece of control hardware that always has a dest address of 0x on all packets so I needed to get rid of posted writes and any bickering over bus master.

-- Robert Crocombe [EMAIL PROTECTED]

2.6.18_00_config.bz2 Description: BZip2 compressed data
Re: ieee1394: host adapter disappears on 1394 bus reset
On 11/27/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

But perhaps more importantly, how are the IRQs distributed?

# cat /proc/interrupts

This is almost right after boot. I generated about 40 bus resets just to stir things up a little:

           CPU0       CPU1       CPU2       CPU3
  0:      33660      36393      30037      69980    IO-APIC-edge  timer
  1:          0          0          1         10    IO-APIC-edge  i8042
  8:          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 12:          0          0          0        113    IO-APIC-edge  i8042
 15: 0270686215                                     IO-APIC-edge  ide1
 50:          1          0      11567          7   IO-APIC-level  aic79xx
 58:          0          0          0          0   IO-APIC-level  ehci_hcd:usb1
 66:          0          0          0          0   IO-APIC-level  ohci_hcd:usb2
 74:          0          1          7         80   IO-APIC-level  ohci1394, ohci1394
 82:          7         23         30         28   IO-APIC-level  ohci1394
 90:          2         28         17         71   IO-APIC-level  eth0
 98:          9         27         21       9182   IO-APIC-level  eth1
106:         19         17         20         26   IO-APIC-level  ohci1394
114:         16         26         34         12   IO-APIC-level  ohci1394
233:          0          0         15          0   IO-APIC-level  aic79xx
NMI:        410         78         75         77
LOC:     166733     166657     166542     166432
ERR:          0
MIS:          0

Also: I couldn't cause the problem when using 4 Fireboard 800s through several hundred bus resets (usually took <= 40 for the Indigita card)

Please add reg_read(ohci, OHCI1394_IntMaskSet); right before hpsb_selfid_complete(host, phyid, isroot);. This will flush the previous reg_write before hpsb_selfid_complete starts doing unspeakable things.

Okay, so the code looks like this now:

	DBGMSG("PhyReqFilter=%08x%08x",
	       reg_read(ohci,OHCI1394_PhyReqFilterHiSet),
	       reg_read(ohci,OHCI1394_PhyReqFilterLoSet));

	reg_read(ohci, OHCI1394_IntMaskSet);

	hpsb_selfid_complete(host, phyid, isroot);

	DBGMSG( "IntEventClear %08x "
		"IntEventSet %08x "
		"IntMaskSet %08x",
		reg_read(ohci, OHCI1394_IntEventClear),
		reg_read(ohci, OHCI1394_IntEventSet),
		reg_read(ohci, OHCI1394_IntMaskSet));

this is in 2.6.16-rt29 which has proved to be the easiest to provoke. I actually couldn't get 2.6.18 to break earlier this morning (few hundred resets).
Okay, I've lost host1 (on the Indigita), but this time the last print statement is:

Nov 27 10:38:27 spanky kernel: ohci1394: fw-host1: IntEventClear IntEventSet 04588000 IntMaskSet 818300f3

just like all the other hosts. I can confirm that no bus reset handlers are called, and there are another 4,000 lines of statements from the other hosts after the last from host1.

-- Robert Crocombe [EMAIL PROTECTED]
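The "read back to flush" suggestion in this message relies on a general PCI ordering rule: a read from a device is not allowed to complete ahead of writes that were posted to it earlier. The toy model below illustrates only that idea. It is not ohci1394 code; struct toy_dev, toy_write() and toy_read() are invented for the sketch, and real hardware buffers writes in bridges rather than in a struct field.

```c
#include <assert.h>
#include <stdint.h>

/* Toy device: one register plus a one-entry posted-write buffer. */
struct toy_dev {
    uint32_t reg;          /* value the device has actually latched */
    uint32_t pending;      /* write still in flight on the bus */
    int      has_pending;  /* nonzero while a write is buffered */
};

/* A posted write returns at once; the device has not seen it yet. */
static void toy_write(struct toy_dev *d, uint32_t value)
{
    assert(d != 0);
    d->pending = value;
    d->has_pending = 1;
}

/* A read may not pass earlier posted writes, so it drains them first. */
static uint32_t toy_read(struct toy_dev *d)
{
    assert(d != 0);
    if (d->has_pending) {
        d->reg = d->pending;
        d->has_pending = 0;
    }
    return d->reg;
}
```

In this model, after toy_write() the register still holds its old value until toy_read() is called, which is why a dummy read is a common way to make sure a preceding register write has reached the device before continuing.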
Re: [patch] x86: unify/rewrite SMP TSC sync code
The difference that Wink reports is tiny compared to that measured on my Opteron machines:

dual (2.6.17):

[EMAIL PROTECTED]:cyclecounter_test$ ./rdtsc-pref 100
rdtsc: average ticks= 10
gtod:average ticks=4296
gtod_us: average ticks=4328

quad (2.6.16-rt29):

[EMAIL PROTECTED]:wink_saville_test$ ./rdtsc-pref 100
rdtsc: average ticks= 10
gtod:average ticks=5688
gtod_us: average ticks=5711

I have my own little test that I'll attach, but it gives a similar result. Here are the results from the 2x box:

[EMAIL PROTECTED]:cyclecounter_test$ ./timing
Using the cycle counter
Calibrated timer as 2593081969.758825 Hz
4194304 iterations in 0.016 seconds is 0.004 useconds per iteration.

[EMAIL PROTECTED]:cyclecounter_test$ ./timing_gettimeofday
Using gettimeofday
4194304 iterations in 6.793 seconds is 1.620 useconds per iteration.

I have used the pthread affinity and/or cpuset, etc. mechanisms to try and inject some reliability into the measurement. Using gtod() can amount to a substantial disturbance of the thing to be measured. Using rdtsc, things seem reliable so far, and we have an FPGA (accessed through the PCI bus) that has been programmed to give access to an 8MHz clock and we do some checks against that.

-- Robert Crocombe [EMAIL PROTECTED]

#include <stdio.h>      // printf()
#include <stdint.h>     // uint64_t
#include <stdlib.h>     // drand48()
#include <sys/select.h> // select()
#include <sys/time.h>   // gettimeofday

// rdtscll(): local x86-only stand-in for the kernel macro of the same name
#define rdtscll(val) do { \
        uint32_t lo_, hi_; \
        __asm__ __volatile__("rdtsc" : "=a" (lo_), "=d" (hi_)); \
        (val) = ((uint64_t)hi_ << 32) | lo_; \
    } while (0)

// Globals
enum { ITERATIONS = 1 << 22 };
static double seconds_per_tick;

// Prototypes
double gimme_timeofday(void);
double get_time(void);
void selectsleep(unsigned us);
void init(void);

// Definitions

// Wall-clock time in seconds, from gettimeofday().
double gimme_timeofday(void)
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

// Time in seconds derived from the cycle counter; needs init() first.
double get_time(void)
{
    uint64_t t;
    rdtscll(t);
    return t * seconds_per_tick;
}

/**
 A good way to simply hang around doing nothing for awhile.
 */
void selectsleep(unsigned us)
{
    struct timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = us;
    select(0,0,0,0,&tv);
}

/**
 Figure out how fast rdtscll() ticks. This should be equal to the
 frequency of the clock on the processor.

 Here's the bad news: I don't know if rdtscll() always uses the same
 processor so it may very well be necessary to set a processor
 affinity to get really good results over time.

 This piece of code by Mark Hahn from brain.mcmaster.ca/~hahn/.
 */
void init(void)
{
    double sumx = 0;
    double sumy = 0;
    double sumxx = 0;
    double sumxy = 0;
    double slope;

    // least squares linear regression of ticks onto real time
    // as returned by gettimeofday.
    const unsigned n = 30;

    for (unsigned int i = 0; i < n; ++i) {
        double breal, real, ticks;
        uint64_t aticks, bticks;

        breal = gimme_timeofday();
        rdtscll(bticks);
        selectsleep((unsigned)(1 + drand48() * 20));
        rdtscll(aticks);
        ticks = aticks - bticks;
        real = gimme_timeofday() - breal;

        sumx += real;
        sumxx += real * real;
        sumxy += real * ticks;
        sumy += ticks;
    }

    slope = ((sumxy - (sumx*sumy) / n) / (sumxx - (sumx*sumx) / n));
    seconds_per_tick = 1.0 / slope;
    printf("Calibrated timer as %.6f Hz\n", slope);
}

int main(int argc, char *argv[])
{
    (void)argc;
    (void)argv;

    printf("Doing stuff\n");

#if 0
    // Using rdtscll()
    printf("Using the cycle counter\n");
    init();
    double time_start = gimme_timeofday();
    for (unsigned int i = 0; i < ITERATIONS; ++i) {
        // volatile keeps the timed call from being optimized away
        volatile double the_time = get_time();
        (void)the_time;
    }
    double time_end = gimme_timeofday();
#else
    // using gettimeofday()
    printf("Using gettimeofday\n");
    double time_start = gimme_timeofday();
    for (unsigned int i = 0; i < ITERATIONS; ++i) {
        // volatile keeps the timed call from being optimized away
        volatile double the_time = gimme_timeofday();
        (void)the_time;
    }
    double time_end = gimme_timeofday();
#endif

    double diff = time_end - time_start;
    double useconds = (diff / ITERATIONS) * 1e6;
    printf("%u iterations in %.3f seconds is %.3f useconds per iteration.\n",
           (unsigned)ITERATIONS, diff, useconds);
    printf("Done\n");
    return 0;
}
Re: ieee1394: host adapter disappears on 1394 bus reset
On 11/22/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

One thing you could try next is to add a debug logging macro which prints the contents of OHCI1394_IntEventClear, OHCI1394_IntEventSet, and OHCI1394_IntMaskSet, right after ohci1394's call to hpsb_selfid_complete. (I'm merely poking in the dark here.)

I think you've got something! I managed to provoke failure from 3 of the 5 interfaces in a single burst of reset clicking! And yes, all 3 failed interfaces are on the Indigita card, and no, the Fireboard has never failed. The last thing I see from the failed interfaces is this:

Nov 27 08:25:51 spanky kernel: ohci1394: fw-host3: PhyReqFilter=
Nov 27 08:25:51 spanky kernel: ohci1394: fw-host3: IntEventClear IntEventSet 6ffdc33f IntMaskSet

which looks very different from the entries by the interfaces that survive (these are the lines immediately before the one above)

Nov 27 08:25:51 spanky kernel: ohci1394: fw-host4: IntEventClear IntEventSet 04508000 IntMaskSet 818300f3
Nov 27 08:25:51 spanky kernel:
Nov 27 08:25:51 spanky kernel: ohci1394: fw-host2: IntEventClear IntEventSet 04508000 IntMaskSet 818300f3
Nov 27 08:25:51 spanky kernel:

I'm not sure if this says anything to you except "hey, don't use those Indigita cards". The problem is, I can't get the number of ports I need using only Fireboards (I think I need 6, and I have 5 PCI slots but need to use some of the other slots).

Is there further diagnostic poking about that I can do to narrow down the problem? Is this something for Indigita? The card is pretty basic: 4 of the TI TSB82AA2 (Ice Lynx) links behind an IBM/Tundra PCI-X bridge. I have an Intel quad ethernet card that uses the exact same part (well, one rev older, actually).

Here's a chunk of my lspci for completeness' sake:

01:04.0 PCI bridge: IBM PCI-X to PCI-X Bridge (rev 03)
01:06.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:04.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:05.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:06.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:07.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)

I will also try cramming a machine full of Fireboards and seeing if I can't get one of them to fail.

-- Robert Crocombe [EMAIL PROTECTED]