Re: duplicate IP addressing
On Tue, Oct 20, 2009 at 03:55:13PM -0400, Paul Cooper wrote:
> I have something interesting going on with VMware (yes, I know it's not open source, but I tried) vs. iSCSI. I am getting a duplicate IP address message reported on the server that is provisioning the LUNs. I'm not sure whether it is a symptom or a cause. Any thoughts, or has anybody heard of this?

Well, it sounds like you have the same IP in use on multiple machines. That's bad.

-- Pasi

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
Re: duplicate IP addressing
Pasi Kärkkäinen wrote:
> On Tue, Oct 20, 2009 at 03:55:13PM -0400, Paul Cooper wrote:
>> I have something interesting going on with VMware (yes, I know it's not open source, but I tried) vs. iSCSI. I am getting a duplicate IP address message reported on the server that is provisioning the LUNs. I'm not sure whether it is a symptom or a cause. Any thoughts, or has anybody heard of this?
>
> Well, it sounds like you have the same IP in use on multiple machines. That's bad.
>
> -- Pasi

Yes, it does sound that way, but I am not sure what the cause of the duplicate IP addresses is.

Regards,
Paul
Re: SCSI pass through command cause iscsi Conn error
niko scsi wrote:
> I found that the initiator sent data twice; check the scsi_pass_through_err.cap file at frame 4 and frame 5. Thank you very much! I want to look inside the code but don't know where to start.

Is the command you are trying to execute a bidirectional command? If so, then the DataSN is just off. It looks like we send a mode select, then the target sends an R2T (so the expected DataSN is incremented to 1), then we send a Data-Out, then the target sends a Data-In, but its DataSN is 0 when it should be 1 (for bidi commands you have to take into account both the R2Ts and the Data-Ins).

If you are not doing a bidi command, then I am not sure I have seen a Data-In in this type of sequence. I think we normally see a Data-In with OK status in a read command. For your command it seems like we should have gotten a SCSI command response PDU.
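The numbering rule described in the reply above (for a bidirectional command, R2Ts and Data-Ins from the target share one running counter) can be sketched as a small checker. This is only an illustration and not code from open-iscsi; the names TargetPdu and check_bidi_sequence are hypothetical, and the trace is typed in by hand from the description in the thread rather than read from the capture file.

```python
from dataclasses import dataclass


@dataclass
class TargetPdu:
    """One PDU sent by the target (hypothetical model, not an open-iscsi type)."""
    kind: str  # "r2t" or "data_in"
    sn: int    # R2TSN for an R2T, DataSN for a Data-In


def check_bidi_sequence(pdus):
    """Return (index, expected, got) for every PDU whose number is off.

    Assumption: the command is bidirectional, so R2T and Data-In PDUs
    are numbered from one shared counter that starts at 0.
    """
    expected = 0
    mismatches = []
    for i, pdu in enumerate(pdus):
        if pdu.kind not in ("r2t", "data_in"):
            raise ValueError(f"unexpected PDU kind: {pdu.kind}")
        if pdu.sn != expected:
            mismatches.append((i, expected, pdu.sn))
        expected += 1  # every R2T or Data-In consumes one number
    return mismatches


# The exchange described above: an R2T numbered 0, then a Data-In numbered 0.
trace = [TargetPdu("r2t", 0), TargetPdu("data_in", 0)]
print(check_bidi_sequence(trace))  # [(1, 1, 0)]
```

The reported tuple says that the second target PDU was expected to carry 1 but carried 0, which matches the mismatch described in the reply.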
Re: 2 questions about log.c
Erez Zilber wrote:
> I took a look at dolog() and log_flush(). Both use semop. If I understood the semop man page correctly, using a negative sem_op value means 'down' (i.e. enter a critical section). Using a positive sem_op value means 'up' (i.e. leave the critical section). According to that, it looks to me that the syslog calls in dolog() and log_flush() print incorrect information. Am I right?
>
> Another (bigger) problem: from time to time, when I run 'iscsiadm -m node -U all', it never returns. When I ran 'echo t > /proc/sysrq-trigger', I got the following:
>
> iscsid S 0 8441 1 8442 24234 (NOTLB)
> Oct 15 14:46:29 b73 kernel: 81012e28dd28 0086 7fb18660
> Oct 15 14:46:29 b73 kernel: 81012e28de48 000a 81003e3bd080 810143b85100
> Oct 15 14:46:29 b73 kernel: 912e761c6b46 2973 81003e3bd268 0006800547f7
> Oct 15 14:46:29 b73 kernel: Call Trace:
> Oct 15 14:46:29 b73 kernel: [8014b5d4] __next_cpu+0x19/0x28
> Oct 15 14:46:29 b73 kernel: [8008bdf5] find_busiest_group+0x20d/0x621
> Oct 15 14:46:29 b73 kernel: [8011c267] sys_semtimedop+0x627/0x720
> Oct 15 14:46:29 b73 kernel: [80063097] thread_return+0x62/0xfe
> Oct 15 14:46:29 b73 kernel: [8004dd1b] lock_hrtimer_base+0x26/0x4c
> Oct 15 14:46:29 b73 kernel: [8003a65a] hrtimer_try_to_cancel+0x4a/0x53
> Oct 15 14:46:29 b73 kernel: [80059d69] hrtimer_cancel+0xc/0x16
> Oct 15 14:46:29 b73 kernel: [80063db6] do_nanosleep+0x47/0x70
> Oct 15 14:46:29 b73 kernel: [80059c56] hrtimer_nanosleep+0x58/0x118
> Oct 15 14:46:29 b73 kernel: [8005d28d] tracesys+0xd5/0xe0
> Oct 15 14:46:29 b73 kernel:
> Oct 15 14:46:29 b73 kernel: iscsid S 800627ba 0 8442 1 29016 8441 (NOTLB)
> Oct 15 14:46:29 b73 kernel: 81012fa65d28 0086 8101a0282100
> Oct 15 14:46:29 b73 kernel: 81012fa65e10 000a 81067683d040 81017b7fb080
> Oct 15 14:46:29 b73 kernel: 912e761c6007 28ca 81067683d228 00018003d267
> Oct 15 14:46:29 b73 kernel: Call Trace:
> Oct 15 14:46:29 b73 kernel: [8011c267] sys_semtimedop+0x627/0x720
> Oct 15 14:46:29 b73 kernel: [80058e3a] inet_stream_connect+0x225/0x236
> Oct 15 14:46:29 b73 kernel: [8021a0a8] sock_getsockopt+0x326/0x348
> Oct 15 14:46:29 b73 kernel: [80032e39] lock_sock+0xa7/0xb2
> Oct 15 14:46:29 b73 kernel: [80217b32] sys_connect+0x7e/0xae
> Oct 15 14:46:29 b73 kernel: [8005d28d] tracesys+0xd5/0xe0
>
> It looks like both iscsid processes are waiting for a semaphore. Later, when I ran strace, I got the following logs (because semop was interrupted):
>
> Oct 15 14:53:28 b73 iscsid: semop up failed 4
> Oct 15 14:53:56 b73 iscsid: semop down failed
> Oct 15 14:54:27 b73 iscsid: semop up failed 4

Was on PTO. Looking into the above.

> BTW - why do we always have 2 iscsid processes?

The log writeout can wait for the data to be written. If we had one iscsid process and it was waiting for log data to be written to an iSCSI disk that has an iSCSI connection problem, we would be stuck. iscsid would not be able to handle the connection error event and log back in, since it would be waiting for the data to be written.

So one iscsid process handles just logging, and the other handles iSCSI events like login, logout, errors, etc.
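The down/up convention discussed in this thread can be illustrated with a toy buffer guarded by a single semaphore. This is a minimal sketch under stated assumptions and is not the log.c implementation: it uses Python's threading.Semaphore in place of System V semop(), and the LogBuffer class and its methods are hypothetical names chosen for the example.

```python
import threading


class LogBuffer:
    """Toy stand-in for a shared log area guarded by one semaphore."""

    def __init__(self):
        self._sem = threading.Semaphore(1)  # 1 means the section is free
        self._lines = []

    def _down(self):
        # Plays the role of semop() with a negative sem_op:
        # enter the critical section, blocking if someone else holds it.
        self._sem.acquire()

    def _up(self):
        # Plays the role of semop() with a positive sem_op:
        # leave the critical section.
        self._sem.release()

    def add(self, line):
        """Producer side: queue one log line."""
        self._down()
        try:
            self._lines.append(line)
        finally:
            self._up()

    def flush(self):
        """Consumer side: take every queued line out of the buffer."""
        self._down()
        try:
            drained, self._lines = self._lines, []
        finally:
            self._up()
        return drained


buf = LogBuffer()
buf.add("session login")
buf.add("session logout")
print(buf.flush())  # ['session login', 'session logout']
```

Every down is paired with an up in a finally clause, so an error inside the critical section cannot leave the semaphore held. A caller that blocks forever in the down step is the kind of hang that the sys_semtimedop frames in the traces above suggest.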
Re: Question about struct idbm
Yangkook Kim wrote:
> Hi,
>
> I have a question about struct idbm. I want to know what role the structure plays in the code of the open-iscsi initiator.
>
> While reading the code of iscsiadm.c, idbm.c, etc., I see that struct idbm is passed to many functions. However, I found that these functions do not use the structure very much, and only a few members of the structure are used by them.
>
> Guessing from the name, I understand that the structure would play a central role in the idbm database, but I don't see where that role is played.
>
> Can anybody briefly explain the role of the structure and its relationship with the idbm database?

What version of open-iscsi are you using? We only have struct idbm in idbm.c now.

It was used to access the iscsi db. The iscsi db has since ended up as just a bunch of files in /etc/iscsi/, so the struct has become less useful and is now mostly used for locking and for referencing the config file the db records are based on.
Re: Kernel Oops
Kevin Ye wrote:
> Thanks Mike. I did the tests you mentioned a couple of times, and they didn't cause a kernel oops. The kernel oops I hit does not happen often. I hit it twice in the last 4 weeks. A kernel patch is welcome and I will give it a try. Thanks.

Shoot, let me do some digging. I was hoping one of those manual commands would trigger the problem. The one where you pull the cable yourself should have run over the same code and caused it.

Are you using multipath? If not, for now you can just disable nops/pings. Set the noop timeout and noop interval to 0 for every target you have set up, and set this in iscsid.conf (you could also set it in iscsid.conf and then rediscover the targets so it gets picked up).

> Kevin
>
> On Thu, Oct 15, 2009 at 12:06 PM, Mike Christie micha...@cs.wisc.edu wrote:
>> On 10/14/2009 05:11 PM, Kevin Ye wrote:
>>> Hi All,
>>>
>>> We hit the kernel oops again on our setup. Any suggestion to fix that?
>>
>> If you just log in and then log out manually
>>
>> iscsiadm -m session -u
>>
>> Does that cause an oops?
>>
>> If you log back in, then pull the network cable, wait to see the ping timeout messages, then manually log out
>>
>> iscsiadm -m session -u
>>
>> Does that cause an oops?
>>
>> Can you rebuild your kernel, if I send you a patch?
>>
>> Thanks.
>>
>>> Our setup is:
>>> kernel: 2.6.24-24
>>> open-iscsi: 2.0-870.3
>>>
>>> kernel logs:
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.697051] scsi841 : iSCSI Initiator over TCP/IP
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.962031] scsi 841:0:0:201: Direct-Access IET VIRTUAL-DISK 0PQ: 0 ANSI: 4
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.969109] sd 841:0:0:201: [sdd] 4505472 512-byte hardware sectors (2307 MB)
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.973314] sd 841:0:0:201: [sdd] Write Protect is off
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.973320] sd 841:0:0:201: [sdd] Mode Sense: 77 00 00 08
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.975420] sd 841:0:0:201: [sdd] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.977468] sd 841:0:0:201: [sdd] 4505472 512-byte hardware sectors (2307 MB)
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.977938] sd 841:0:0:201: [sdd] Write Protect is off
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.977944] sd 841:0:0:201: [sdd] Mode Sense: 77 00 00 08
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.981749] sd 841:0:0:201: [sdd] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28466.981761] sdd: sdd1
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28467.027801] sd 841:0:0:201: [sdd] Attached SCSI disk
>>> Oct 9 21:15:50 ian_ser_2 kernel: [28467.027886] sd 841:0:0:201: Attached scsi generic sg4 type 0
>>> Oct 9 21:16:01 ian_ser_2 kernel: [28477.713280] connection626:0: ping timeout of 15 secs expired, last rx 7049831, last ping 7052331, now 7056081
>>> Oct 9 21:16:01 ian_ser_2 kernel: [28477.713467] connection626:0: detected conn error (1011)
>>> Oct 9 21:16:01 ian_ser_2 kernel: [28477.717268] connection627:0: ping timeout of 15 secs expired, last rx 7049832, last ping 7052332, now 7056082
>>> Oct 9 21:16:01 ian_ser_2 kernel: [28477.717458] connection627:0: detected conn error (1011)
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.049414] BUG: unable to handle kernel NULL pointer dereference at virtual address 0060
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.049639] printing eip: e08a212a *pde =
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.049924] Oops: [#1] SMP
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.050100] Modules linked in: iscsi_tcp libiscsi scsi_transport_iscsi iscsi_trgt crc32c libcrc32c nls_iso8859_1 nls_cp437 vfat fat vmmemctl cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_stats freq_table cpufreq_powersave sbs video output sbshc dock battery iptable_filter ip_tables x_tables vmhgfs lp loop ipv6 container serio_raw ac button evdev parport_pc parport i2c_piix4 i2c_core intel_agp agpgart shpchp pci_hotplug psmouse pcspkr ext3 jbd mbcache sd_mod sg sr_mod cdrom pata_acpi ata_generic floppy pcnet32 mii mptspi mptscsih mptbase scsi_transport_spi ata_piix libata scsi_mod raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse vmxnet
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.051174]
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.051174]
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.051286] Pid: 16444, comm: iscsi_scan_839 Not tainted (2.6.24-24-generic #1)
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.051433] EIP: 0060:[e08a212a] EFLAGS: 00010202 CPU: 0
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.052073] EIP is at spi_device_match+0x1a/0x60 [scsi_transport_spi]
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.052178] EAX: EBX: c27ff0b0 ECX: c27ff000 EDX: c27ff0b0
>>> Oct 9 21:16:10 ian_ser_2 kernel: [28486.052274] ESI: c27ff0b0 EDI: d0c31800 EBP: c0286000 ESP: