[ewg] RE: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread John Russo
Sorry Jeff... I am playing middle man with another engineer here for this 
information...

The query does not have to be per QP, but it does need to be per IB HCA port to 
IB HCA port communication path.

For example, if a node has 64 CPUs, it could do the query once for each other
node on behalf of all 64 processes.

It's still an N^2 set of queries, but at least N can be reduced to the number
of end node IB ports as opposed to the number of processes.
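
To put rough numbers on that reduction, here is a minimal sketch. It assumes every endpoint asks for a path to every other endpoint and that each node has exactly one IB port; the function name and the 1024-node cluster size are illustrative only and do not come from the thread.

```python
def path_queries(endpoints: int) -> int:
    """Number of ordered (source, destination) pairs among `endpoints`."""
    return endpoints * (endpoints - 1)


nodes = 1024           # assumed cluster size (illustrative)
procs_per_node = 64    # the 64-CPU node from the example above

# One query per process pair versus one query per port pair.
per_process = path_queries(nodes * procs_per_node)
per_port = path_queries(nodes)  # assumes one IB port per node

print("per-process queries:", per_process)
print("per-port queries:   ", per_port)
```

Both counts grow quadratically, but under these assumptions the per-port count is smaller by roughly the square of the number of processes per node.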



-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Monday, January 05, 2009 12:07 PM
To: John Russo
Cc: Tziporet Koren; ewg@lists.openfabrics.org; gene...@lists.openfabrics.org
Subject: Re: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

Hmm.  Perhaps I'm not grokking your answer -- did you answer my  
question?  I'm indirectly asking about scalability of the SM to have  
hundreds/thousands of MPI processes simultaneously querying the SM.


On Jan 5, 2009, at 11:47 AM, John Russo wrote:

> When using routing algorithms such as lash, the SL used per end to
> end connection will vary based on the route.  lash uses multiple VLs
> to avoid credit loops.  As such the SL reported will vary based on
> fabric topology and which pair of end nodes the path is being
> requested on behalf of.
>
> -----Original Message-----
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Monday, January 05, 2009 11:39 AM
> To: John Russo
> Cc: Tziporet Koren; ewg@lists.openfabrics.org; gene...@lists.openfabrics.org
> Subject: Re: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)
>
> Would all MPI processes need to query the SM for each path that they
> want to use in a QP?



[ewg] OFED teleconference

2009-01-05 Thread Jeff Squyres (jsquyres)
Meeting invitation: OFED teleconference
Organizer: Jeff Squyres (jsquyres)
Date/Time: Jan 5, 2009 at 12:00 PM America/New_York
Length: 60 minutes
Frequency: every 2 weeks (Mondays), until Jan 19, 2009
Meeting ID: 210020028 (Cisco Unified MeetingPlace)
Global access numbers: http://cisco.com/en/US/about/doing_business/conferencing/index.html
External attendees (web and voice): http://meetingplace.cisco.com/join.asp?210020028
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

[ewg] OFED teleconf outlook invites

2009-01-05 Thread Jeff Squyres
An Outlook invite is coming shortly for an OFED teleconference today
and Jan 19th, both at the usual times.


*** Please do not send me the replies (accept, tentative, decline).   
Thanks.


--
Jeff Squyres
Cisco Systems



[ewg] RE: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread John Russo
When using routing algorithms such as lash, the SL used per end to end 
connection will vary based on the route.  lash uses multiple VLs to avoid 
credit loops.  As such the SL reported will vary based on fabric topology and 
which pair of end nodes the path is being requested on behalf of.
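
A small illustration of why a single job-wide SL cannot work in this situation. The table below is invented data standing in for what an SA might report under a lash-style routing; `PATH_SL`, the port names and the SL values are all hypothetical, and the sketch assumes the SL is the same in both directions, which need not hold on a real fabric.

```python
# Hypothetical per-path SLs, keyed by (source port, destination port).
PATH_SL = {
    ("node-a", "node-b"): 0,
    ("node-a", "node-c"): 1,
    ("node-b", "node-c"): 0,
}


def sl_for_path(src: str, dst: str) -> int:
    """Look up the SL for one path; direction is ignored in this sketch."""
    key = (src, dst) if (src, dst) in PATH_SL else (dst, src)
    return PATH_SL[key]


def uniform_sl_is_valid(sl: int) -> bool:
    """True only if every path in the table happens to use the same SL."""
    return all(value == sl for value in PATH_SL.values())


print(sl_for_path("node-c", "node-a"))
print(uniform_sl_is_valid(0))
```

As soon as two paths in the table carry different SLs, no single configured value can be correct for every connection in the job.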

-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Monday, January 05, 2009 11:39 AM
To: John Russo
Cc: Tziporet Koren; ewg@lists.openfabrics.org; gene...@lists.openfabrics.org
Subject: Re: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

Would all MPI processes need to query the SM for each path that they  
want to use in a QP?




[ewg] Re: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Jeff Squyres
Hmm.  Perhaps I'm not grokking your answer -- did you answer my  
question?  I'm indirectly asking about scalability of the SM to have  
hundreds/thousands of MPI processes simultaneously querying the SM.



On Jan 5, 2009, at 11:47 AM, John Russo wrote:

> When using routing algorithms such as lash, the SL used per end to
> end connection will vary based on the route.  lash uses multiple VLs
> to avoid credit loops.  As such the SL reported will vary based on
> fabric topology and which pair of end nodes the path is being
> requested on behalf of.






--
Jeff Squyres
Cisco Systems



[ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Davis, Arlin R

There are scaling issues with SA path-record queries. We attempted to be good
citizens with Intel MPI using the rdma_cm agent (via uDAPL) but were forced to
build hard-coded RC QP support in OFED 1.4 (uDAPL scm) to avoid the many
scaling and configuration problems that came with IPoIB requirements, ARP
storms, rdma_cm timers, and SA path record query/caching.
If someone wants to sign up to design and implement a scalable SA query caching
agent, we would be happy to look at path record queries again.

-arlin
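
For readers unfamiliar with the idea, the core of such a caching agent can be sketched as follows. This is only an illustration under stated assumptions: `PathRecordCache` and `fake_sa` are hypothetical names, the record fields are made up, and the sketch ignores invalidation when the fabric changes, which a real agent would have to handle.

```python
from typing import Callable, Dict, Tuple

PortPair = Tuple[str, str]


class PathRecordCache:
    """Node-local cache: one upstream query per (source, destination) port pair."""

    def __init__(self, query_sa: Callable[[str, str], dict]) -> None:
        self._query_sa = query_sa          # stand-in for a real SA query
        self._records: Dict[PortPair, dict] = {}
        self.upstream_queries = 0          # how many times the SA was asked

    def get(self, src_port: str, dst_port: str) -> dict:
        key = (src_port, dst_port)
        if key not in self._records:
            self.upstream_queries += 1
            self._records[key] = self._query_sa(src_port, dst_port)
        return self._records[key]


def fake_sa(src_port: str, dst_port: str) -> dict:
    """Placeholder that fabricates a record; a real agent would ask the SA."""
    return {"src": src_port, "dst": dst_port, "sl": 0, "pkey": 0xFFFF}


cache = PathRecordCache(fake_sa)
for _ in range(64):                        # 64 local processes, same remote port
    cache.get("port-a", "port-b")
print(cache.upstream_queries)
```

The point of the design is that the number of queries reaching the SA depends on the number of distinct port pairs, not on the number of local processes asking.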

Another suggestion for 1.5

Implementation of SA queries for Path Records (using IBTA 1.2.1 ServiceId 
field) in all OFED ULPs, especially for MPI
The IBTA standard defines that the proper way to establish a 
connection is to get a PathRecord from the SM/SA and use it to define all the 
attributes of the communication path.
Ideally the IBTA CM should then be used to establish the connection and QPs as 
well.

At present, openmpi, mvapich1 and mvapich2 do not use PathRecords, but instead 
hard code attributes like the PKey, SL, etc.
In some cases these hardcoded values can be overridden by configurable values 
such as PKey and SL, but such values must be uniform across all connections and 
must be provided per job (which can be error prone/tedious).

At present opensm supports PKeys and SLs, however MPI cannot easily 
use these features.
Other features, such as lash routing, in opensm do not work properly with MPI 
because the SL must be uniform across all connections, but for lash it will 
vary per route.

Additionally, applications which do not use PathRecords will have difficulties 
with advanced features like IB routing, partitioning, etc.  All of which are 
available or being worked on in opensm.



[ewg] Re: [ofa-general] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Jeff Squyres
Would all MPI processes need to query the SM for each path that they  
want to use in a QP?



On Jan 5, 2009, at 11:31 AM, John Russo wrote:


Another suggestion for 1.5

Implementation of SA queries for Path Records (using IBTA 1.2.1  
ServiceId field) in all OFED ULPs, especially for MPI
The IBTA standard defines that the proper way to  
establish a connection is to get a PathRecord from the SM/SA and use  
it to define all the attributes of the communication path.
Ideally the IBTA CM should then be used to establish the connection  
and QPs as well.


At present, openmpi, mvapich1 and mvapich2 do not use PathRecords,  
but instead hard code attributes like the PKey, SL, etc.
In some cases these hardcoded values can be overridden by  
configurable values such as PKey and SL, but such values must be  
uniform across all connections and must be provided per job (which  
can be error prone/tedious).


At present opensm supports PKeys and SLs, however MPI  
cannot easily use these features.
Other features, such as lash routing, in opensm do not work properly  
with MPI because the SL must be uniform across all connections, but  
for lash it will vary per route.


Additionally, applications which do not use PathRecords will have  
difficulties with advanced features like IB routing, partitioning,  
etc.  All of which are available or being worked on in opensm.





--
Jeff Squyres
Cisco Systems



Re: [ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Jeff Squyres
I chatted with John and Todd from QL on the phone today -- we  
basically came to the same conclusion:


- need to beef-up opensm to be able to scalably handle lots of  
incoming path record lookups
- need to beef-up the CM clients on the host (maybe; this work might  
already be done?)
- need to see the current status of the SA caching stuff / re-open  
that discussion to see if the work can be completed, etc.


It might also be worthwhile to start a whole new discussion about  
making a better CM (at least from the ULP perspective).  One that  
offers simple mechanisms for those who don't need/care about the  
details, but also offers complex/detailed mechanisms (perhaps  
remarkably like today's mechanisms).




On Jan 5, 2009, at 2:01 PM, Davis, Arlin R wrote:



There are scaling issues with SA path-record queries. We attempted
to be good citizens with Intel MPI using the rdma_cm agent (via
uDAPL) but were forced to build hard-coded RC QP support in OFED 1.4
(uDAPL scm) to avoid the many scaling and configuration problems
that came with IPoIB requirements, ARP storms, rdma_cm timers, and
SA path record query/caching.
If someone wants to sign up to design and implement a scalable SA  
query caching agent we would be happy to look at path record queries  
again.


-arlin




--
Jeff Squyres
Cisco Systems



[ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread John Russo
Another suggestion for 1.5

Implementation of SA queries for Path Records (using IBTA 1.2.1 ServiceId 
field) in all OFED ULPs, especially for MPI
The IBTA standard defines that the proper way to establish a 
connection is to get a PathRecord from the SM/SA and use it to define all the 
attributes of the communication path.
Ideally the IBTA CM should then be used to establish the connection and QPs as 
well.

At present, openmpi, mvapich1 and mvapich2 do not use PathRecords, but instead 
hard code attributes like the PKey, SL, etc.
In some cases these hardcoded values can be overridden by configurable values 
such as PKey and SL, but such values must be uniform across all connections and 
must be provided per job (which can be error prone/tedious).

At present opensm supports PKeys and SLs; however, MPI cannot easily
use these features.
Other features, such as lash routing, in opensm do not work properly with MPI 
because the SL must be uniform across all connections, but for lash it will 
vary per route.

Additionally, applications which do not use PathRecords will have difficulties
with advanced features like IB routing, partitioning, etc., all of which are
available or being worked on in opensm.


From: ewg-boun...@lists.openfabrics.org 
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
Sent: Monday, January 05, 2009 1:00 AM
To: ewg@lists.openfabrics.org
Cc: gene...@lists.openfabrics.org
Subject: [ewg] Agenda for the OFED meeting today (Jan 5, 09)



Hello all,

I hope we all had nice holidays and vacations, and now it's the time to get 
back to business.

Agenda for OFED meeting today:

1. Conclusions from OFED 1.4 release: Open discussion

2. Do we wish to have OFED 1.4.: Please send pros and cons before the meeting

3. OFED 1.5: Schedule and features.

This is what we presented in SC08 about 1.5:

Preliminary Schedule:

 *   Feature Freeze: 3/20/09
 *   Alpha Release: 3/20/09
 *   Beta Release: 4/20/09
 *   RC1: 5/5/09
 *   RC2-RCx: About every 2 weeks as needed
 *   Release: June 2009

Features:

 *   Kernel.org: 2.6.28 and 2.6.29
 *   Multiple Event Queues to support Multi-core CPUs
 *   NFS/RDMA - GA
 *   RDS support for iWARP
 *   OpenMPI 1.3
 *   Add support/backports for RedHat EL 5.3 and EL 4.8, SLES 11
 *   Support for Mellanox vNIC (EoIB) and FCoIB with BridgeX device
 *   more TBD...


We also presented the OS matrix but I suggest we will close this in the next 
meeting.

My proposal:

 *   Have the release in July and not June - so we will have more time for 
development

 *   Stick to one kernel version base and not change in the middle since we saw 
that changing the kernel base caused a delay.
We need to decide in the meeting if it is 2.6.29 or we should wait for 2.6.30.

 *   Add IB over Eth - this is similar to iWARP but more like IB (e.g. 
including UD), and can work over ConnectX.


Please send your suggestions to the list before the meeting if possible

Tziporet


Re: [ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Hal Rosenstock
Jeff,

On Mon, Jan 5, 2009 at 3:26 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> I chatted with John and Todd from QL on the phone today -- we basically came
> to the same conclusion:
>
> - need to beef-up opensm to be able to scalably handle lots of incoming path
> record lookups

This is the most obvious SA scalability issue but there are some
others which may be important (related to SA caching rather than SA
distribution as an approach).

> - need to beef-up the CM clients on the host (maybe; this work might already
> be done?)
> - need to see the current status of the SA caching stuff / re-open that
> discussion to see if the work can be completed, etc.

IMO this will aggravate other SA scalability issues, and there are other
limitations with this approach as well.

Don't get me wrong; I'm all for improving the SA scalability; there's
no quick solution to this AFAIK.

It would be interesting to see an apples to apples comparison of
OpenSM and proprietary SMs in terms of running on the same hardware
and the transaction rate for various things.

I think this warrants an open discussion if people are serious about
working on this issue.

> It might also be worthwhile to start a whole new discussion about making a
> better CM (at least from the ULP perspective). One that offers simple
> mechanisms for those who don't need/care about the details, but also offers
> complex/detailed mechanisms (perhaps remarkably like today's mechanisms).

I've heard similar comments before but this too will take significant
wherewithal IMO.

-- Hal


Re: [ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)

2009-01-05 Thread Jeff Squyres

On Jan 5, 2009, at 5:16 PM, Hal Rosenstock wrote:


> Don't get me wrong; I'm all for improving the SA scalability; there's
> no quick solution to this AFAIK.
>
> It would be interesting to see an apples to apples comparison of
> OpenSM and proprietary SMs in terms of running on the same hardware
> and the transaction rate for various things.
>
> I think this warrants an open discussion if people are serious about
> working on this issue.


Agreed.  This set of issues has come up many times before
on the list; it will be interesting to see if anyone will *do*
anything about it this time.  :-)


(obviously, I'm only interested as a consumer of the end result)

>> It might also be worthwhile to start a whole new discussion about making a
>> better CM (at least from the ULP perspective). One that offers simple
>> mechanisms for those who don't need/care about the details, but also offers
>> complex/detailed mechanisms (perhaps remarkably like today's mechanisms).
>
> I've heard similar comments before but this too will take significant
> wherewithal IMO.



Ditto my above remarks.  :-)

--
Jeff Squyres
Cisco Systems
