Re: duplicate IP addressing

2009-10-21 Thread Pasi Kärkkäinen

On Tue, Oct 20, 2009 at 03:55:13PM -0400, Paul Cooper wrote:
 
 I have something interesting going on with VMware (yeah, I know, not open 
 source, but I tried) vs. iSCSI.
 I am getting a duplicate IP address message reported on the 
 server that is provisioning the LUNs. Not sure if it is a symptom or a 
 cause. Any thoughts, or has anybody heard of this?


Well.. it sounds like you have the same IP in use on multiple machines.

That's bad.

-- Pasi


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: duplicate IP addressing

2009-10-21 Thread Paul Cooper
Pasi Kärkkäinen wrote:
 On Tue, Oct 20, 2009 at 03:55:13PM -0400, Paul Cooper wrote:
   
 I have something interesting going on with VMware (yeah, I know, not open 
 source, but I tried) vs. iSCSI.
 I am getting a duplicate IP address message reported on the 
 server that is provisioning the LUNs. Not sure if it is a symptom or a 
 cause. Any thoughts, or has anybody heard of this?

 

 Well.. it sounds like you have the same IP in use on multiple machines.

 That's bad.

 -- Pasi


 

   
Yes, it does sound that way, but I am not sure what the cause of the 
duplicate IP addresses is...
Regards,
Paul




Re: SCSI pass through command cause iscsi Conn error

2009-10-21 Thread Mike Christie

niko scsi wrote:
 I found that the initiator sent data twice; check the
 scsi_pass_through_err.cap file at frame 4 and frame 5. Thank you very much!
 I want to look inside the code but don't know where to start.
 

Is the command you are trying to execute a bidirectional command? If so, 
then the DataSN is just off. It looks like we send a MODE SELECT, then 
the target sends an R2T (so the expected DataSN is incremented to 1), then 
we send a Data-Out, then the target sends a Data-In, but its DataSN is 0 
when it should be 1 (for bidi commands you have to take into account both 
the R2Ts and the Data-Ins).

If you are not doing a bidi command, then I am not sure I have seen a 
Data-In in this type of sequence. I think we normally see a Data-In with 
OK status in a read command. For your command it seems like we should 
have gotten a SCSI command response PDU.




Re: 2 questions about log.c

2009-10-21 Thread Mike Christie

Erez Zilber wrote:
 I took a look at dolog() & log_flush(). Both use semop. If I
 understood the semop man page correctly, using a negative sem_op value
 means 'down' (i.e. enter a critical section). Using a positive sem_op
 value means 'up' (i.e. leave the critical section). According to that,
 it looks to me that the syslog calls in dolog() & log_flush() print
 incorrect information. Am I right?
 
 Another (bigger) problem - from time to time, when I run 'iscsiadm -m
 node -U all', it never returns. When I ran 'echo t >
 /proc/sysrq-trigger', I got the following:
 
 iscsidS  0  8441  1  8442 24234 
 (NOTLB)
 Oct 15 14:46:29 b73 kernel:  81012e28dd28 0086
  7fb18660
 Oct 15 14:46:29 b73 kernel:  81012e28de48 000a
 81003e3bd080 810143b85100
 Oct 15 14:46:29 b73 kernel:  912e761c6b46 2973
 81003e3bd268 0006800547f7
 Oct 15 14:46:29 b73 kernel: Call Trace:
 Oct 15 14:46:29 b73 kernel:  [8014b5d4] __next_cpu+0x19/0x28
 Oct 15 14:46:29 b73 kernel:  [8008bdf5] 
 find_busiest_group+0x20d/0x621
 Oct 15 14:46:29 b73 kernel:  [8011c267] sys_semtimedop+0x627/0x720
 Oct 15 14:46:29 b73 kernel:  [80063097] thread_return+0x62/0xfe
 Oct 15 14:46:29 b73 kernel:  [8004dd1b] lock_hrtimer_base+0x26/0x4c
 Oct 15 14:46:29 b73 kernel:  [8003a65a]
 hrtimer_try_to_cancel+0x4a/0x53
 Oct 15 14:46:29 b73 kernel:  [80059d69] hrtimer_cancel+0xc/0x16
 Oct 15 14:46:29 b73 kernel:  [80063db6] do_nanosleep+0x47/0x70
 Oct 15 14:46:29 b73 kernel:  [80059c56] hrtimer_nanosleep+0x58/0x118
 Oct 15 14:46:29 b73 kernel:  [8005d28d] tracesys+0xd5/0xe0
 Oct 15 14:46:29 b73 kernel:
 Oct 15 14:46:29 b73 kernel: iscsidS 800627ba 0
 8442  1 29016  8441 (NOTLB)
 Oct 15 14:46:29 b73 kernel:  81012fa65d28 0086
  8101a0282100
 Oct 15 14:46:29 b73 kernel:  81012fa65e10 000a
 81067683d040 81017b7fb080
 Oct 15 14:46:29 b73 kernel:  912e761c6007 28ca
 81067683d228 00018003d267
 Oct 15 14:46:29 b73 kernel: Call Trace:
 Oct 15 14:46:29 b73 kernel:  [8011c267] sys_semtimedop+0x627/0x720
 Oct 15 14:46:29 b73 kernel:  [80058e3a]
 inet_stream_connect+0x225/0x236
 Oct 15 14:46:29 b73 kernel:  [8021a0a8] sock_getsockopt+0x326/0x348
 Oct 15 14:46:29 b73 kernel:  [80032e39] lock_sock+0xa7/0xb2
 Oct 15 14:46:29 b73 kernel:  [80217b32] sys_connect+0x7e/0xae
 Oct 15 14:46:29 b73 kernel:  [8005d28d] tracesys+0xd5/0xe0
 
 It looks like both iscsid processes are waiting for a semaphore.
 
 Later, when I ran strace, I got the following logs (because semop was
 interrupted):
 
 Oct 15 14:53:28 b73 iscsid: semop up failed 4
 Oct 15 14:53:56 b73 iscsid: semop down failed
 Oct 15 14:54:27 b73 iscsid: semop up failed 4

Was on PTO. Looking into the above.
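
For reference, the semop convention described in the quoted message can be sketched as below. This is a minimal illustration for Linux, not the code in log.c: the helper names sem_create_unlocked, sem_apply, sem_down and sem_up are hypothetical. If the trailing 4 in the "semop up failed 4" lines is an errno value, it corresponds to EINTR on Linux, which would fit the call being interrupted while strace was attached, so the sketch simply retries in that case.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* On Linux the caller has to define union semun for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore, initialised to 1 (unlocked). */
static int sem_create_unlocked(void)
{
    union semun arg = { .val = 1 };
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Apply one operation to semaphore 0, retrying when a signal interrupts it. */
static int sem_apply(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };

    while (semop(semid, &op, 1) == -1) {
        if (errno != EINTR)
            return -1; /* genuine failure; errno says why */
    }
    return 0;
}

/* Negative sem_op: 'down', i.e. enter the critical section (may block). */
static int sem_down(int semid) { return sem_apply(semid, -1); }

/* Positive sem_op: 'up', i.e. leave the critical section. */
static int sem_up(int semid) { return sem_apply(semid, 1); }
```

A caller would bracket access to the shared data with sem_down() and sem_up(), and remove the set with semctl(semid, 0, IPC_RMID) once it is no longer needed.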


 
 BTW - why do we always have 2 iscsid processes?
 

The log writeout can wait for the data to be written. If we had one 
iscsid process and it was waiting for log data to be written to an iSCSI 
disk that has an iSCSI connection problem, we would be stuck. iscsid would 
not be able to handle the connection error event and relogin, since it 
would be waiting for the data to be written. So one iscsid process handles 
just logging and the other handles iSCSI events like login, logout, errors, etc.
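
The split described above can be sketched roughly as follows. This is not how iscsid passes log data between its two processes (the quoted question indicates log.c relies on semaphores); a pipe is swapped in here only to keep the illustration short. The names write_all, start_logger and log_line are hypothetical. The point is that only the child ever blocks on storage, while the event-handling parent writes to a non-blocking pipe and drops a message instead of waiting.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write the whole buffer, retrying on partial writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Fork a child whose only job is to copy log bytes from a pipe to out_fd.
 * Returns the child's pid and stores the non-blocking write end of the
 * pipe in *log_fd, or returns -1 on failure.
 */
static pid_t start_logger(int out_fd, int *log_fd)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Logger process: may sit in write() for as long as storage stalls. */
        char buf[512];
        ssize_t n;

        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
            if (write_all(out_fd, buf, (size_t)n) == -1)
                _exit(1);
        }
        _exit(n == 0 ? 0 : 1);
    }

    /* Event process: keep only the write end and make it non-blocking. */
    close(fds[0]);
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK) == -1) {
        close(fds[1]);          /* child sees EOF and exits */
        waitpid(pid, NULL, 0);
        return -1;
    }
    *log_fd = fds[1];
    return pid;
}

/*
 * Queue one message without blocking. Returns 0 if it was queued, -1 if it
 * was dropped (for example because the pipe is full). Messages of at most
 * PIPE_BUF bytes are written atomically or not at all.
 */
static int log_line(int log_fd, const char *msg)
{
    size_t len = strlen(msg);

    return write(log_fd, msg, len) == (ssize_t)len ? 0 : -1;
}
```

Closing the write end makes the logger drain what is left and exit, after which the parent can reap it with waitpid(). A real daemon would also need to ignore SIGPIPE so that a dead logger cannot take the event process down with it.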




Re: Question about struct idbm

2009-10-21 Thread Mike Christie

Yangkook Kim wrote:
 Hi, I have a question about struct idbm.
 I want to know what role the structure plays in the code of the
 open-iscsi initiator.
 
 While reading the code of iscsiadm.c, idbm.c, etc., I see that struct idbm
 is passed to many functions. However, I found that these functions do not
 use the structure very much, and only a few members of the structure are
 used by these functions.
 
 Guessing from the name, I understand that the structure would play a
 central role in the idbm database, but I don't see where that role is
 played.
 
 Can anybody briefly explain the role of the structure and
 its relationship with the idbm database?
 

What version of open-iscsi are you using? We only have struct idbm in 
idbm.c now.

It was used to access the iSCSI db. The iSCSI db has ended up as just a 
bunch of files in /etc/iscsi/, so the struct has become less useful and is 
now mostly used for locking and for referencing the config file the db 
records are based on.




Re: Kernel Oops

2009-10-21 Thread Mike Christie

Kevin Ye wrote:
 Thanks Mike.
 
 I did the tests you mentioned a couple of times, and it didn't cause kernel
 oops.
 
 The kernel Oops I hit does not happen often. I hit twice in last 4 weeks.
 
 kernel patch is welcome and I will give it a try. Thanks.
 

Shoot, let me do some digging. I was hoping one of those manual 
commands would trigger the problem. The one where you pull the cable 
yourself should have run over the same code and caused it.

Are you using multipath? If not, for now you can just disable nops/pings. 
Set the noop timeout and noop interval to 0 for every target you have 
set up, and also set this in iscsid.conf (you could also set it in 
iscsid.conf and then rediscover the targets so it gets picked up).
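
A sketch of what that could look like in iscsid.conf, assuming the parameter names used by open-iscsi releases of this era (check the iscsid.conf shipped with your version before relying on them):

```
# Disable NOP-Out pings by setting both values to 0, as suggested above.
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
```

Records for targets that were discovered earlier keep the values they were created with, which is why the advice is to either update each existing target record or rediscover the targets after editing the file.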



 Kevin
 
 On Thu, Oct 15, 2009 at 12:06 PM, Mike Christie micha...@cs.wisc.edu wrote:
 
 On 10/14/2009 05:11 PM, Kevin Ye wrote:
 Hi All,

 We hit the kernel oops again on our setup. Any suggestion to fix that?
 If you just login then logout manually

 iscsiadm -m session -u

 Does that cause an oops?


 If you log back in, then pull the network cable, wait to see the ping
 timeout messages then manually logout

 iscsiadm -m session -u

 Does that cause an oops?


 Can you rebuild your kernel, if I send you a patch?



 Thanks.

 Our set up is:
 kernel: 2.6.24-24
 open-iscsi: 2.0-870.3

 kernel logs:
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.697051] scsi841 : iSCSI
 Initiator
 over TCP/IP
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.962031] scsi 841:0:0:201:
 Direct-Access IET  VIRTUAL-DISK 0PQ: 0 ANSI: 4
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.969109] sd 841:0:0:201: [sdd]
 4505472 512-byte hardware sectors (2307 MB)
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.973314] sd 841:0:0:201: [sdd]
 Write
 Protect is off
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.973320] sd 841:0:0:201: [sdd]
 Mode
 Sense: 77 00 00 08
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.975420] sd 841:0:0:201: [sdd]
 Write
 cache: disabled, read cache: disabled, doesn't support DPO or FUA
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.977468] sd 841:0:0:201: [sdd]
 4505472 512-byte hardware sectors (2307 MB)
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.977938] sd 841:0:0:201: [sdd]
 Write
 Protect is off
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.977944] sd 841:0:0:201: [sdd]
 Mode
 Sense: 77 00 00 08
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.981749] sd 841:0:0:201: [sdd]
 Write
 cache: disabled, read cache: disabled, doesn't support DPO or FUA
 Oct  9 21:15:50 ian_ser_2 kernel: [28466.981761]  sdd: sdd1
 Oct  9 21:15:50 ian_ser_2 kernel: [28467.027801] sd 841:0:0:201: [sdd]
 Attached SCSI disk
 Oct  9 21:15:50 ian_ser_2 kernel: [28467.027886] sd 841:0:0:201: Attached
 scsi generic sg4 type 0
 Oct  9 21:16:01 ian_ser_2 kernel: [28477.713280]  connection626:0: ping
 timeout of 15 secs expired, last rx 7049831, last ping 7052331, now
 7056081
 Oct  9 21:16:01 ian_ser_2 kernel: [28477.713467]  connection626:0:
 detected
 conn error (1011)
 Oct  9 21:16:01 ian_ser_2 kernel: [28477.717268]  connection627:0: ping
 timeout of 15 secs expired, last rx 7049832, last ping 7052332, now
 7056082
 Oct  9 21:16:01 ian_ser_2 kernel: [28477.717458]  connection627:0:
 detected
 conn error (1011)
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.049414] BUG: unable to handle
 kernel NULL pointer dereference at virtual address 0060
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.049639] printing eip: e08a212a
 *pde
 = 
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.049924] Oops:  [#1] SMP
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.050100] Modules linked in:
 iscsi_tcp libiscsi scsi_transport_iscsi iscsi_trgt crc32c libcrc32c
 nls_iso8859_1 nls_cp437 vfat fat vmmemctl cpufreq_conservative
 cpufreq_ondemand cpufreq_userspace cpufreq_stats freq_table
 cpufreq_powersave sbs video output sbshc dock battery iptable_filter
 ip_tables x_tables vmhgfs lp loop ipv6 container serio_raw ac button
 evdev
 parport_pc parport i2c_piix4 i2c_core intel_agp agpgart shpchp
 pci_hotplug
 psmouse pcspkr ext3 jbd mbcache sd_mod sg sr_mod cdrom pata_acpi
 ata_generic
 floppy pcnet32 mii mptspi mptscsih mptbase scsi_transport_spi ata_piix
 libata scsi_mod raid10 raid456 async_xor async_memcpy async_tx xor raid1
 raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod thermal
 processor
 fan fbcon tileblit font bitblit softcursor fuse vmxnet
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.051174]

 Oct  9 21:16:10 ian_ser_2 kernel: [28486.051174]
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.051286] Pid: 16444, comm:
 iscsi_scan_839 Not tainted (2.6.24-24-generic #1)
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.051433] EIP: 0060:[e08a212a]
 EFLAGS: 00010202 CPU: 0
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.052073] EIP is at
 spi_device_match+0x1a/0x60 [scsi_transport_spi]
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.052178] EAX:  EBX:
 c27ff0b0
 ECX: c27ff000 EDX: c27ff0b0
 Oct  9 21:16:10 ian_ser_2 kernel: [28486.052274] ESI: c27ff0b0 EDI:
 d0c31800
 EBP: c0286000 ESP: