[ofa-general] Re: QoS management in OpenSM - doc
Sasha Khapyorsky wrote:
> Hi Yevgeny,
>
> On 16:42 Wed 27 Feb , Yevgeny Kliteynik wrote:
>> The following doc describes QoS management in OpenSM.
>> This doc (named QoS_management_in_OpenSM.txt) has been added to the
>> OFED docs, along with the QoS_in_OFED.txt.
>> I'd like to add this info to OpenSM man pages as well.
>
> Yes, I think that it could be useful to have it under opensm/doc too.
>
>> I'm including the text here as is, so it will be easier to follow
>> possible changes. When those will be done, I'll fix the format to
>> match the OpenSM man pages and post a patch.
>> The only problem is that the whole OpenSM man has ~850 lines, while
>> this QoS management file has ~500 lines... :)
>
> I would suggest to have some basic part (50-100 lines) included in the
> man page and reference an entire document (under opensm/doc) for more
> details.

OK, I can prepare some kind of summary that would go into the man page.
However, this means that a user would not be able to define a QoS policy
just from reading the OpenSM man pages - he will HAVE to check the full
doc under opensm/doc.

>> Please review.
>
> Looks fine, few tiny nits are below.
>
> [snip...]
>
>> == 4. Policy File Syntax Guidelines ==
>>
>> - Empty lines are ignored.
>
> It is mentioned on the next line too.

Right

>> - Leading and trailing blanks, as well as empty lines, are ignored, so
>>   the indentation in the example is just for better readability.
>> - Comments are started with the pound sign (#) and terminated by EOL.
>> - Any keyword should be the first non-blank in the line, unless it's a
>>   comment.
>> - Keywords that denote section/subsection start have matching closing
>>   keywords.
>> - Having a QoS Level named DEFAULT is a must - it is applied to PR/MPR
>>   requests that didn't match any of the matching rules.
>> - Any section/subsection of the policy file is optional.
>
> [snip...]
>
>> == 6. Simplified QoS Policy - Details and Examples ==
>>
>> Simplified QoS policy match rules are tailored for matching PR/MPR
>> requests of ULPs (or of some application on top of a ULP). This section
>> has a list of per-ULP (or per-application) match rules and the SL that
>> should be enforced on the matched PR/MPR query.
>>
>> Match rules include:
>>
>> - Default match rule that is applied to PR/MPR query that didn't match
>>   any of the other match rules
>> - SDP
>> - SDP application with a specific target TCP/IP port range
>> - SRP with a specific target IB port GUID
>> - RDS
>> - iSER
>> - iSER application with a specific target TCP/IP port range
>> - IPoIB with a default PKey
>> - IPoIB with a specific PKey
>> - any ULP/application with a specific Service ID in the PR/MPR query
>> - any ULP/application with a specific PKey in the PR/MPR query
>> - any ULP/application with a specific target IB port GUID in the
>>   PR/MPR query
>>
>> Since any section of the policy file is optional, as long as basic
>> rules of the file are kept (such as not referring to a nonexistent port
>> group, having a default QoS Level, etc.), the simplified policy section
>> (qos-ulps) can serve as a complete QoS policy file.
>> The shortest policy file in this case would be as follows:
>>
>>     qos-ulps
>>         default : 0 #default SL
>>     end-qos-ulps
>>
>> It is equivalent to the previous example of the shortest policy file,
>> and it is also equivalent to not having a policy file at all.
>>
>> Below is an example of simplified QoS policy with all the possible
>> keywords:
>>
>>     qos-ulps
>>         default                      : 0 # default SL
>>         sdp, port-num 3              : 0 # SL for application running on top
>>                                          # of SDP when a destination
>>                                          # TCP/IP port is 3
>>         sdp, port-num 1-2            : 0
>>         sdp                          : 1 # default SL for any other
>>                                          # application running on top of SDP
>>         rds                          : 2 # SL for RDS traffic
>>         iser, port-num 900           : 0 # SL for iSER with a specific target
>>                                          # port
>>         iser                         : 3 # default SL for iSER
>>         ipoib, pkey 0x0001           : 0 # SL for IPoIB on partition with
>>                                          # pkey 0x0001
>>         ipoib                        : 4 # default IPoIB partition,
>>                                          # pkey=0x7FFF
>>         any, service-id 0x6234       : 6 # match any PR/MPR query with a
>>                                          # specific Service ID
>>         any, pkey 0x0ABC             : 6 # match any PR/MPR query with a
>>                                          # specific PKey
>>         srp, target-port-guid 0x1234 : 5 # SRP when SRP Target is located on
>>                                          #
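
To make the match-rule idea above concrete, here is a minimal Python sketch of how a qos-ulps section could be read and applied. It is not OpenSM's parser. The function names `parse_qos_ulps` and `select_sl`, the keyword-argument names, and above all the precedence (a rule with a criterion wins over a bare ULP rule, and `default` applies last) are assumptions made for illustration; the quoted text does not state the actual precedence.

```python
# Hypothetical sketch: parse a simplified QoS policy (qos-ulps) and pick an SL.
EXAMPLE_POLICY = """
qos-ulps
    default                : 0 # default SL
    sdp, port-num 3        : 0
    sdp, port-num 1-2      : 0
    sdp                    : 1
    rds                    : 2
    iser, port-num 900     : 0
    iser                   : 3
    ipoib, pkey 0x0001     : 0
    ipoib                  : 4
    any, service-id 0x6234 : 6
    any, pkey 0x0ABC       : 6
end-qos-ulps
"""


def parse_qos_ulps(text):
    """Return a list of (ulp, criterion, low, high, sl) tuples."""
    rules = []
    in_section = False
    for raw in text.splitlines():
        # Comments start with '#' and run to end of line; blanks are ignored.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line == "qos-ulps":
            in_section = True
            continue
        if line == "end-qos-ulps":
            in_section = False
            continue
        if not in_section:
            continue
        match_part, sl_part = line.rsplit(":", 1)
        fields = [f.strip() for f in match_part.split(",")]
        criterion = low = high = None
        if len(fields) > 1:
            criterion, value = fields[1].split()
            low_s, _, high_s = value.partition("-")  # "1-2" is a range
            low = int(low_s, 0)                      # base 0 accepts 0x... too
            high = int(high_s, 0) if high_s else low
        rules.append((fields[0], criterion, low, high, int(sl_part)))
    return rules


def select_sl(rules, ulp, **attrs):
    """Pick an SL; precedence here is an assumption, not OpenSM's documented one."""
    default_sl = bare_sl = None
    for rule_ulp, criterion, low, high, sl in rules:
        if rule_ulp == "default":
            default_sl = sl
        elif criterion is None:
            if rule_ulp == ulp:
                bare_sl = sl
        elif rule_ulp in (ulp, "any"):
            # "port-num" in the file maps to the keyword argument port_num.
            value = attrs.get(criterion.replace("-", "_"))
            if value is not None and low <= value <= high:
                return sl
    return bare_sl if bare_sl is not None else default_sl
```

With the example policy, `select_sl(parse_qos_ulps(EXAMPLE_POLICY), "sdp", port_num=80)` falls through the two port-specific SDP rules to the bare `sdp` rule, and an unlisted ULP such as `srp` ends up on the `default` SL.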
[ofa-general] ofa_1_3_kernel 20080302-0200 daily build status
This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.24

Failed:

___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
On Sat, 2008-03-01 at 22:53 +, Sasha Khapyorsky wrote:
> On 19:59 Fri 29 Feb , Hal Rosenstock wrote:
>> If that makes sense, then also query commands on this state would
>> likely also.
>
> Not sure about this. It is dynamically updated flag, so it would be
> hard to catch a valid value by hand from the OpenSM console.

I was referring to the balance state not that flag. Does that make more
sense ?

-- Hal

> Sasha
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
On 05:04 Sun 02 Mar , Hal Rosenstock wrote:
> On Sat, 2008-03-01 at 22:53 +, Sasha Khapyorsky wrote:
>> On 19:59 Fri 29 Feb , Hal Rosenstock wrote:
>>> If that makes sense, then also query commands on this state would
>>> likely also.
>>
>> Not sure about this. It is dynamically updated flag, so it would be
>> hard to catch a valid value by hand from the OpenSM console.
>
> I was referring to the balance state not that flag. Does that make
> more sense ?

What do you mean? Routing dumps?

Sasha
Re: [ofa-general] page allocation failure
Looks like your system is low on memory. Maybe you have to add memory, or
maybe something eats your memory (a memory leak?).

On Thu, 2008-02-28 at 18:42 +0100, Bernd Schubert wrote:
> Hello,
>
> on several of our Lustre servers we can see page allocation failures.
> This is with 2.6.22 + kernel modules from ofed 1.2.5
>
> [44464.764559] Lustre: 24052:0:(ldlm_lib.c:698:target_handle_connect()) Skipped 16 previous similar messages
> [54132.351263] ib_cm/2: page allocation failure. order:0, mode:0x10d0
> [54132.360738]
> [54132.360741] Call Trace:
> [54132.367803] [8020ac61] show_trace+0x34/0x47
> [54132.373235] [8020ac86] dump_stack+0x12/0x17
> [54132.378937] [80251bc4] __alloc_pages+0x2a3/0x2bc
> [54132.386180] [8020f75c] dma_alloc_pages+0x9b/0xbf
> [54132.395120] [8020f7f6] dma_alloc_coherent+0x76/0x1cc
> [54132.401651] [8809af1e] :ib_mthca:mthca_buf_alloc+0x1bd/0x2a3
> [54132.408897] [8809f9a9] :ib_mthca:mthca_alloc_qp_common+0x246/0x4e5
> [54132.418884] [880a0c6d] :ib_mthca:mthca_alloc_qp+0xab/0x102
> [54132.425774] [880a5217] :ib_mthca:mthca_create_qp+0x126/0x281
> [54132.432716] [88054bc5] :ib_core:ib_create_qp+0x17/0x91
> [54132.439102] [88161c9f] :rdma_cm:rdma_create_qp+0x2d/0x153
> [54132.446301] [8835d0cc] :ko2iblnd:kiblnd_create_conn+0x81c/0x1250
> [54132.456992] [88365295] :ko2iblnd:kiblnd_passive_connect+0x605/0xdd0
> [54132.469847] [88366975] :ko2iblnd:kiblnd_cm_callback+0x255/0xeb0
> [54132.478821] [881620e7] :rdma_cm:cma_req_handler+0x322/0x389
> [54132.485637] [88155fa4] :ib_cm:cm_process_work+0x17/0xad
> [54132.492182] [88157025] :ib_cm:cm_req_handler+0x7ae/0x81b
> [54132.499236] [881570bf] :ib_cm:cm_work_handler+0x2d/0xbaa
> [54132.506690] [80236291] run_workqueue+0x7f/0x10b
> [54132.512652] [80236b1a] worker_thread+0xda/0xe4
> [54132.520136] [8023959a] kthread+0x47/0x75
> [54132.525570] [8020a2f8] child_rip+0xa/0x12
> [54132.532975]
> [54132.535527] Mem-info:
> [54132.538157] Node 0 DMA per-cpu:
> [54132.542303] CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.551752] CPU1: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.561661] CPU2: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.571154] CPU3: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.580597] CPU4: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.592354] CPU5: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.601794] CPU6: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.610719] CPU7: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
> [54132.619630] Node 0 DMA32 per-cpu:
> [54132.623551] CPU0: Hot: hi: 186, btch: 31 usd: 49 Cold: hi: 62, btch: 15 usd: 49
> [54132.632691] CPU1: Hot: hi: 186, btch: 31 usd: 26 Cold: hi: 62, btch: 15 usd: 3
> [54132.642680] CPU2: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 54
> [54132.651897] CPU3: Hot: hi: 186, btch: 31 usd: 1 Cold: hi: 62, btch: 15 usd: 13
> [54132.663321] CPU4: Hot: hi: 186, btch: 31 usd: 43 Cold: hi: 62, btch: 15 usd: 55
> [54132.673282] CPU5: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 49
> [54132.683636] CPU6: Hot: hi: 186, btch: 31 usd: 25 Cold: hi: 62, btch: 15 usd: 1
> [54132.693156] CPU7: Hot: hi: 186, btch: 31 usd: 13 Cold: hi: 62, btch: 15 usd: 56
> [54132.703412] Node 0 Normal per-cpu:
> [54132.707024] CPU0: Hot: hi: 186, btch: 31 usd: 130 Cold: hi: 62, btch: 15 usd: 14
> [54132.719317] CPU1: Hot: hi: 186, btch: 31 usd: 81 Cold: hi: 62, btch: 15 usd: 1
> [54132.729276] CPU2: Hot: hi: 186, btch: 31 usd: 134 Cold: hi: 62, btch: 15 usd: 2
> [54132.738819] CPU3: Hot: hi: 186, btch: 31 usd: 124 Cold: hi: 62, btch: 15 usd: 8
> [54132.748078] CPU4: Hot: hi: 186, btch: 31 usd: 21 Cold: hi: 62, btch: 15 usd: 4
> [54132.758029] CPU5: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 9
> [54132.766855] CPU6: Hot: hi: 186, btch: 31 usd: 120 Cold: hi: 62, btch: 15 usd: 13
> [54132.776462] CPU7: Hot: hi: 186, btch: 31 usd: 166 Cold: hi: 62, btch: 15 usd: 12
> [54132.786009] Active:28507 inactive:62701 dirty:8386 writeback:27 unstable:0
> [54132.786010] free:5586 slab:273528 mapped:2136 pagetables:699 bounce:0
> [54132.803082] Node 0 DMA free:11192kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10660kB pages_scanned:0 all_unreclaimable? yes
> [54132.816507] lowmem_reserve[]: 0 3255 4013
> [54132.820811] Node 0 DMA32 free:9812kB min:6564kB low:8204kB high:9844kB
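
When reading a dump like the one above, it can help to pull out the failing allocation and compare each zone's free memory against its min/low/high watermarks. The following is only a rough Python sketch for doing that by hand; `failed_allocation` and `zone_status` are hypothetical helper names, and the regular expressions assume exactly the line layout seen in this log. Since the excerpt is cut off before the Normal zone line, it cannot tell which zone the failing request actually hit.

```python
import re

# Sample lines copied from the log quoted above.
SAMPLE_LOG = """\
[54132.351263] ib_cm/2: page allocation failure. order:0, mode:0x10d0
[54132.803082] Node 0 DMA free:11192kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10660kB pages_scanned:0 all_unreclaimable? yes
[54132.820811] Node 0 DMA32 free:9812kB min:6564kB low:8204kB high:9844kB
"""

FAIL_RE = re.compile(
    r"(?P<task>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)
ZONE_RE = re.compile(
    r"Node (?P<node>\d+) (?P<zone>\w+) free:(?P<free>\d+)kB "
    r"min:(?P<min>\d+)kB low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def failed_allocation(log_text):
    """Return (task, order, pages) for the first failure line, or None."""
    m = FAIL_RE.search(log_text)
    if m is None:
        return None
    order = int(m.group("order"))
    # An order-N request asks for 2**N contiguous pages.
    return m.group("task"), order, 1 << order


def zone_status(log_text):
    """Return [(zone, free_kB, state)] for every zone line found."""
    result = []
    for m in ZONE_RE.finditer(log_text):
        free = int(m.group("free"))
        if free < int(m.group("min")):
            state = "below min"
        elif free < int(m.group("low")):
            state = "below low"
        elif free < int(m.group("high")):
            state = "below high"
        else:
            state = "above high"
        result.append((m.group("zone"), free, state))
    return result
```

For the sample lines this reports a single-page (order 0) request from `ib_cm/2`, the DMA zone above its high watermark, and DMA32 sitting between low and high.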
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
Hey Hal,

Are you saying a flag inside each osm_switch_t to indicate if that
specific switch is balanced?

The script I wrote for the balance check did have difficulty determining
a lot of corner cases (is port connected to a CA? is it active? what
ports are up vs. down links, etc.). At the end of the day you just output
a lot of extra info and have to look through it manually. Although
probably not easy as a whole, these calculations would be easier in
opensm since that information is available.

Al

> On 05:04 Sun 02 Mar , Hal Rosenstock wrote:
>> On Sat, 2008-03-01 at 22:53 +, Sasha Khapyorsky wrote:
>>> On 19:59 Fri 29 Feb , Hal Rosenstock wrote:
>>>> If that makes sense, then also query commands on this state would
>>>> likely also.
>>>
>>> Not sure about this. It is dynamically updated flag, so it would be
>>> hard to catch a valid value by hand from the OpenSM console.
>>
>> I was referring to the balance state not that flag. Does that make
>> more sense ?
>
> What do you mean? Routing dumps?
>
> Sasha

--
Albert Chu
[EMAIL PROTECTED]
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[ofa-general] Re: [PATCH] opensm: enforce routing paths rebalancing on switch reconnection
Hey Sasha,

In order to make things work, I also had to add this patch. Seems like a
corner case that needs to be handled since we never fall into
__osm_pi_rcv_process_switch_port(). (BTW, I am working off a 3.1.10
branch for the test cluster, so this patch is forward ported and
technically untested.)

--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -564,6 +564,7 @@ void osm_pi_rcv_process(IN void *context, IN void *data)
 			", Commencing heavy sweep\n",
 			cl_ntoh64(node_guid), cl_ntoh64(port_guid));
 		sm->p_subn->force_heavy_sweep = 1;
+		sm->p_subn->ignore_existing_lfts = 1;
 		goto Exit;
 	}

Al

> Hey Sasha,
>
> This patch should definitely work. I'll let you know after I get a
> chance to try it.
>
> Al
>
>> Hi Al,
>>
>> On 16:08 Sat 01 Mar , Sasha Khapyorsky wrote:
>>> When switch ports were reconnected we need to recalculate routing
>>> paths balancing. Reconnection is detected by port state examination -
>>> when it becomes INIT routing paths rebalancing (ignore_existing_lfts
>>> flag) is enforced.
>>>
>>> Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
>>
>> This patch is simpler than all previous ones. I tested it with ibsim
>> already. Could you test in your environment?
>>
>> Sasha
>
> --
> Albert Chu
> [EMAIL PROTECTED]
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

--
Albert Chu
[EMAIL PROTECTED]
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[ofa-general] Re: [PATCH] mmu notifiers #v8 + xpmem
Here is an example of the further orthogonal work to do on top of #v8
during .26-rc to make the whole mmu notifier API sleep capable.

1) Every single ptep_clear_flush_young_notify and ptep_clear_flush_notify
   must be converted like the below. The below is the conversion of a
   single one. do_wp_page has been converted by Christoph already but
   with invalidate_range (should be changed to invalidate_page by
   releasing the refcount on the page after calling invalidate_page).
   Hope it's clear why I'd rather not depend on these changes to be
   merged in .25 in order to have the mmu notifier included in .25.

2) Then after all this conversion work is finished, it's trivial to
   delete ptep_clear_flush_young_notify and ptep_clear_flush_notify from
   mmu_notifier.h (they will be unused macros once the conversion is
   complete).

3) After that the VM has to be changed to convert anon_vma lock and
   i_mmap_lock spinlocks to mutex/rwsemaphore.

4) Then finally the mmu_notifier_unregister must be dropped to make the
   mmu notifier sleep capable with RCU in the mmu_notifier() fast path.

It's unclear at this point if 3/4 should be switchable and happening
under a CONFIG_XPMEM or similar, or if everyone will benefit from those
spinlocks becoming mutexes (the only one that is certain to appreciate
such a change is preempt-rt; the rest of the userbase I don't know for
sure and I'd be more comfortable with a TPC number comparison before
doing such a change by default, but I leave the commentary on such a
change to linux-mm in a separate thread).

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/mm/rmap.c b/mm/rmap.c
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -274,7 +274,7 @@ static int page_referenced_one(struct pa
 	unsigned long address;
 	pte_t *pte;
 	spinlock_t *ptl;
-	int referenced = 0;
+	int referenced = 0, clear_flush_young = 0;
 
 	address = vma_address(page, vma);
 	if (address == -EFAULT)
@@ -287,8 +287,11 @@ static int page_referenced_one(struct pa
 	if (vma->vm_flags & VM_LOCKED) {
 		referenced++;
 		*mapcount = 1;	/* break early from loop */
-	} else if (ptep_clear_flush_young_notify(vma, address, pte))
-		referenced++;
+	} else {
+		clear_flush_young = 1;
+		if (ptep_clear_flush_young(vma, address, pte))
+			referenced++;
+	}
 
 	/* Pretend the page is referenced if the task has the
 	   swap token and is in the middle of a page fault. */
@@ -298,6 +301,11 @@ static int page_referenced_one(struct pa
 	(*mapcount)--;
 	pte_unmap_unlock(pte, ptl);
+
+	if (clear_flush_young)
+		referenced += mmu_notifier_clear_flush_young(vma->vm_mm,
+							     address);
+
 out:
 	return referenced;
 }
[ofa-general] [infiniband-diags] check_lft_balance script
Hey Sasha,

Here's the script I mentioned before that I used for the balance checking
earlier. It's nothing fancy but probably could be useful to others.

Al

--
Albert Chu
[EMAIL PROTECTED]
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory

From 110a19a2d1bdaafe1ace7b2c48f39be5c1ec388f Mon Sep 17 00:00:00 2001
From: Albert L. Chu [EMAIL PROTECTED]
Date: Sat, 1 Mar 2008 20:02:03 -0800
Subject: [PATCH] add check_lft_balance script

Signed-off-by: Albert L. Chu [EMAIL PROTECTED]
---
 infiniband-diags/Makefile.am                  |    6 +-
 infiniband-diags/infiniband-diags.spec.in     |    4 +
 infiniband-diags/man/check_lft_balance.8      |   42
 infiniband-diags/scripts/check_lft_balance.pl |  319 +
 4 files changed, 369 insertions(+), 2 deletions(-)
 create mode 100644 infiniband-diags/man/check_lft_balance.8
 create mode 100755 infiniband-diags/scripts/check_lft_balance.pl

diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am
index ca66e2d..8bbda9e 100644
--- a/infiniband-diags/Makefile.am
+++ b/infiniband-diags/Makefile.am
@@ -25,7 +25,8 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \
 		scripts/ibqueryerrors.pl scripts/ibswportwatch.pl \
 		scripts/iblinkinfo.pl scripts/ibprintswitch.pl \
 		scripts/ibprintca.pl scripts/ibprintrt.pl \
-		scripts/ibfindnodesusing.pl scripts/ibidsverify.pl
+		scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \
+		scripts/check_lft_balance.pl
 
 src_ibaddr_SOURCES = src/ibaddr.c src/ibdiag_common.c
 src_ibaddr_CFLAGS = -Wall $(DBGFLAGS)
@@ -89,7 +90,8 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \
 		man/iblinkinfo.8 man/ibqueryerrors.8 man/ibswportwatch.8 \
 		man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \
 		man/ibdatacounts.8 man/ibdatacounters.8 \
-		man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8
+		man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \
+		man/check_lft_balance.pl
 
 BUILT_SOURCES = ibdiag_version
 ibdiag_version:
diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in
index 7a0e17b..9c8c0c4 100644
--- a/infiniband-diags/infiniband-diags.spec.in
+++ b/infiniband-diags/infiniband-diags.spec.in
@@ -48,6 +48,7 @@ rm -rf $RPM_BUILD_ROOT
 %{_sbindir}/vendstat
 %{_sbindir}/dump_mfts.sh
 %{_sbindir}/dump_lfts.sh
+%{_sbindir}/check_lft_balance.pl
 %{_sbindir}/set_nodedesc.sh
 %{_sbindir}/sm*
 %define _perldir %(perl -e 'use Config; $T=$Config{installsitearch}; $T=~/(.*)\\/site_perl.*/; print $1;')
@@ -56,6 +57,9 @@ rm -rf $RPM_BUILD_ROOT
 %doc README COPYING ChangeLog
 
 %changelog
+* Mon Mar 03 2008 Albert Chu [EMAIL PROTECTED] - 1.3.5
+- Add check_lft_balance script.
+
 * Wed Oct 31 2007 Ira Weiny [EMAIL PROTECTED] - 1.3.2
 - Change switch-map option to node-name-map
 
diff --git a/infiniband-diags/man/check_lft_balance.8 b/infiniband-diags/man/check_lft_balance.8
new file mode 100644
index 000..35243f6
--- /dev/null
+++ b/infiniband-diags/man/check_lft_balance.8
@@ -0,0 +1,42 @@
+.TH CHECK_LFT_BALANCE.SH 8 "March 1, 2008" "OpenIB" "OpenIB Diagnostics"
+
+.SH NAME
+check_lft_balance.sh \- check InfiniBand unicast forwarding tables balance
+
+.SH SYNOPSIS
+.B check_lft_balance.sh
+[-hRv]
+
+
+.SH DESCRIPTION
+.PP
+check_lft_balance.sh is a script which checks for balancing in Infiniband
+unicast forwarding tables. It analyzes the output of
+.BR dump_lfts(8)
+and
+.BR iblinkinfo(8).
+
+.SH OPTIONS
+
+.PP
+.TP
+\fB\-h\fR
+show help
+.TP
+\fB\-R\fR
+Recalculate dump_lfts information, ie do not use the cached
+information. This option is slower but should be used if the diag tools have
+not been used for some time or if there are other reasons to believe that
+the fabric has changed.
+.TP
+\fB\-v\fR
+verbose output
+
+.SH SEE ALSO
+.BR dump_lfts(8),
+.BR iblinkinfo(8)
+
+.SH AUTHORS
+.TP
+Albert Chu
+.RI [EMAIL PROTECTED]
diff --git a/infiniband-diags/scripts/check_lft_balance.pl b/infiniband-diags/scripts/check_lft_balance.pl
new file mode 100755
index 000..c4186ed
--- /dev/null
+++ b/infiniband-diags/scripts/check_lft_balance.pl
@@ -0,0 +1,319 @@
+#!/usr/bin/perl
+#
+# Copyright (C) 2001-2003 The Regents of the University of California.
+# Copyright (c) 2006 The Regents of the University of California.
+# Copyright (c) 2007 Voltaire, Inc. All rights reserved.
+#
+# Produced at Lawrence Livermore National Laboratory.
+# Written by Ira Weiny [EMAIL PROTECTED]
+#            Jim Garlick [EMAIL PROTECTED]
+#            Albert Chu [EMAIL PROTECTED]
+#
+# This software is available to you under a choice of one of two
+# licenses. You may choose to be licensed under the terms of the GNU
+# General Public License (GPL) Version 2, available from the file
+# COPYING in the main directory of this source tree, or the
+# OpenIB.org BSD license below:
+#
+# Redistribution and use in source and binary forms, with or
+#
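
Since the body of check_lft_balance.pl is cut off here, the idea of an LFT balance check can only be illustrated loosely. The Python sketch below is not a port of the Perl script. It assumes a hypothetical, already-parsed forwarding table (a dict from destination LID to output port) and a caller-supplied list of ports to compare, and it uses a simple "route counts differ by at most `tolerance`" criterion chosen purely for illustration.

```python
def port_load(lft, ports):
    """Count how many destination LIDs are routed through each given port.

    lft   -- dict mapping destination LID -> output port (hypothetical input)
    ports -- ports to compare, e.g. the uplinks of one switch
    """
    load = {port: 0 for port in ports}
    for out_port in lft.values():
        if out_port in load:
            load[out_port] += 1
    return load


def is_balanced(load, tolerance=1):
    """Assumed criterion: busiest and idlest port differ by <= tolerance."""
    if not load:
        return True
    return max(load.values()) - min(load.values()) <= tolerance


if __name__ == "__main__":
    # Five destination LIDs spread over three uplinks.
    example_lft = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}
    load = port_load(example_lft, ports=[1, 2, 3])
    print(load, "balanced" if is_balanced(load) else "unbalanced")
```

Passing the port list in explicitly sidesteps the corner cases mentioned elsewhere in the thread (which ports face CAs, which are active, which are up or down links); a real tool would have to derive that from link information.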
[ofa-general] Re: [PATCH] mmu notifiers #v8 + xpmem
On Sun, 2008-03-02 at 17:03 +0100, Andrea Arcangeli wrote:
> 4) Then finally the mmu_notifier_unregister must be dropped to make
>    the mmu notifier sleep capable with RCU in the mmu_notifier() fast
>    path.

Or require PREEMPTIBLE_RCU, that can handle sleeps..
[ofa-general] [PATCH] mmu notifiers #v8
Difference between #v7 and #v8:

1) s/age_page/clear_flush_young/ (Nick's suggestion)
2) macro fix (Andrew)
3) move release before final unmap_vmas (for GRU, Jack/Christoph)
4) microoptimize mmu_notifier_unregister (Christoph)
5) use mmap_sem for registration serialization (Christoph)

The (void)xxx in macros doesn't work with args. Christoph's solution
looks best in avoiding warnings, even if it forces the mmu notifier
operation structure to be visible even if MMU_NOTIFIER=n (that's the only
downside).

I didn't drop invalidate_page, because invalidate_range_begin/end would
be slower for usages like KVM/GRU (we don't need a begin/end there
because where invalidate_page is called, the VM holds a reference on the
page). do_wp_page should also use invalidate_page since it can free the
page after dropping the PT lock without losing any performance (that's
not true for the places where invalidate_range is called).

It'd be nice if everyone involved can agree to converge on this API for
.25. KVM/GRU (and perhaps Quadrics) and similar usages will be fully
covered in .25. This is a kernel internal API so there's no problem if
all the methods become sleep capable only starting in .26. The brainer
part of the VM work to do to make it sleep capable is pretty much
orthogonal with this patch.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -10,6 +10,7 @@
 #include <linux/rbtree.h>
 #include <linux/rwsem.h>
 #include <linux/completion.h>
+#include <linux/mmu_notifier.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
 
@@ -228,6 +229,8 @@ struct mm_struct {
 #ifdef CONFIG_CGROUP_MEM_CONT
 	struct mem_cgroup *mem_cgroup;
 #endif
+
+	struct mmu_notifier_head mmu_notifier; /* MMU notifier list */
 };
 
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
new file mode 100644
--- /dev/null
+++ b/include/linux/mmu_notifier.h
@@ -0,0 +1,161 @@
+#ifndef _LINUX_MMU_NOTIFIER_H
+#define _LINUX_MMU_NOTIFIER_H
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+
+struct mmu_notifier;
+
+struct mmu_notifier_ops {
+	/*
+	 * Called when nobody can register any more notifier in the mm
+	 * and after the mn notifier has been disarmed already.
+	 */
+	void (*release)(struct mmu_notifier *mn,
+			struct mm_struct *mm);
+
+	/*
+	 * clear_flush_young is called after the VM is
+	 * test-and-clearing the young/accessed bitflag in the
+	 * pte. This way the VM will provide proper aging to the
+	 * accesses to the page through the secondary MMUs and not
+	 * only to the ones through the Linux pte.
+	 */
+	int (*clear_flush_young)(struct mmu_notifier *mn,
+				 struct mm_struct *mm,
+				 unsigned long address);
+
+	/*
+	 * Before this is invoked any secondary MMU is still ok to
+	 * read/write to the page previously pointed by the Linux pte
+	 * because the old page hasn't been freed yet. If required
+	 * set_page_dirty has to be called internally to this method.
+	 */
+	void (*invalidate_page)(struct mmu_notifier *mn,
+				struct mm_struct *mm,
+				unsigned long address);
+
+	/*
+	 * invalidate_range_begin() and invalidate_range_end() must be
+	 * paired. Multiple invalidate_range_begin/ends may be nested
+	 * or called concurrently.
+	 */
+	void (*invalidate_range_begin)(struct mmu_notifier *mn,
+				       struct mm_struct *mm,
+				       unsigned long start, unsigned long end);
+	void (*invalidate_range_end)(struct mmu_notifier *mn,
+				     struct mm_struct *mm,
+				     unsigned long start, unsigned long end);
+};
+
+struct mmu_notifier {
+	struct hlist_node hlist;
+	const struct mmu_notifier_ops *ops;
+};
+
+#ifdef CONFIG_MMU_NOTIFIER
+
+struct mmu_notifier_head {
+	struct hlist_head head;
+};
+
+#include <linux/mm_types.h>
+
+/*
+ * Must hold the mmap_sem for write.
+ *
+ * RCU is used to traverse the list. A quiescent period needs to pass
+ * before the notifier is guaranteed to be visible to all threads.
+ */
+extern void mmu_notifier_register(struct mmu_notifier *mn,
+				  struct mm_struct *mm);
+/*
+ * Must hold the mmap_sem for write.
+ *
+ * RCU is used to traverse the list. A quiescent period needs to pass
+ * before the struct mmu_notifier can be freed. Alternatively it
+ * can be synchronously freed inside ->release when the list can't
+ * change anymore and nobody could possibly walk it.
+ */
+extern void mmu_notifier_unregister(struct
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
On Sun, 2008-03-02 at 14:17 +, Sasha Khapyorsky wrote:
> On 05:04 Sun 02 Mar , Hal Rosenstock wrote:
>> On Sat, 2008-03-01 at 22:53 +, Sasha Khapyorsky wrote:
>>> On 19:59 Fri 29 Feb , Hal Rosenstock wrote:
>>>> If that makes sense, then also query commands on this state would
>>>> likely also.
>>>
>>> Not sure about this. It is dynamically updated flag, so it would be
>>> hard to catch a valid value by hand from the OpenSM console.
>>
>> I was referring to the balance state not that flag. Does that make
>> more sense ?
>
> What do you mean? Routing dumps?

A different routing dump reflecting balance or not and how out of
balance.

-- Hal

> Sasha
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
Hi Al,

On Sun, 2008-03-02 at 07:16 -0800, Albert Chu wrote:
> Hey Hal,
>
> Are you saying a flag inside each osm_switch_t to indicate if that
> specific switch is balanced?

I wasn't saying anything about implementation. I was saying there could be OpenSM console commands to:

1. rebalance, and
2. display relevant state regarding balance/imbalance.

> The script I wrote for the balance check did have difficulty
> determining a lot of corner cases (is port connected to a CA? is it
> active? what ports are up vs. down links, etc.). At the end of the day
> you just output a lot of extra info and have to look through it
> manually. Although probably not easy as a whole, these calculations
> would be easier in opensm since that information is available.

That's what I was suggesting rather than a separate diag script, although the latter seems like it would be good too.

-- Hal

> Al
>
> > > On 05:04 Sun 02 Mar , Hal Rosenstock wrote:
> > > > On Sat, 2008-03-01 at 22:53 +, Sasha Khapyorsky wrote:
> > > > > On 19:59 Fri 29 Feb , Hal Rosenstock wrote:
> > > > > > If that makes sense, then query commands on this state would
> > > > > > likely make sense also.
> > > > >
> > > > > Not sure about this. It is a dynamically updated flag, so it
> > > > > would be hard to catch a valid value by hand from the OpenSM
> > > > > console.
> > > >
> > > > I was referring to the balance state, not that flag. Does that
> > > > make more sense?
> >
> > What do you mean? Routing dumps?
> >
> > Sasha
Re: [ofa-general] Re: [OpenSM] updn routing performance fix???
On 09:51 Sun 02 Mar , Hal Rosenstock wrote:
> A different routing dump reflecting balance or not and how out of balance.

This makes sense. Actually, OpenSM already has a dump of this sort, but it is printed to stdout.

Sasha
Re: [ofa-general] [infiniband-diags] check_lft_balance script
Hey Sasha, Noticed one more thing I could clean up from the original script. here's a new one. Al Hey Sasha, Here's the script I mentioned before that I used for the balance checking earlier. Its nothing fancy but probably could be useful to others. Al -- Albert Chu [EMAIL PROTECTED] 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -- Albert Chu [EMAIL PROTECTED] 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From 656ba8ee8103fd27f3041570d8c44d823848abf4 Mon Sep 17 00:00:00 2001 From: Albert L. Chu [EMAIL PROTECTED] Date: Sat, 1 Mar 2008 20:02:03 -0800 Subject: [PATCH] add check_lft_balance script Signed-off-by: Albert L. Chu [EMAIL PROTECTED] --- infiniband-diags/Makefile.am |6 +- infiniband-diags/infiniband-diags.spec.in |4 + infiniband-diags/man/check_lft_balance.8 | 42 infiniband-diags/scripts/check_lft_balance.pl | 314 + 4 files changed, 364 insertions(+), 2 deletions(-) create mode 100644 infiniband-diags/man/check_lft_balance.8 create mode 100755 infiniband-diags/scripts/check_lft_balance.pl diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index ca66e2d..8bbda9e 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -25,7 +25,8 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/ibqueryerrors.pl scripts/ibswportwatch.pl \ scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ scripts/ibprintca.pl scripts/ibprintrt.pl \ - scripts/ibfindnodesusing.pl scripts/ibidsverify.pl + scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ + scripts/check_lft_balance.pl src_ibaddr_SOURCES = src/ibaddr.c src/ibdiag_common.c src_ibaddr_CFLAGS = -Wall $(DBGFLAGS) @@ -89,7 +90,8 @@ man_MANS = 
man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/iblinkinfo.8 man/ibqueryerrors.8 man/ibswportwatch.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ - man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 + man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ + man/check_lft_balance.pl BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in index 7a0e17b..9c8c0c4 100644 --- a/infiniband-diags/infiniband-diags.spec.in +++ b/infiniband-diags/infiniband-diags.spec.in @@ -48,6 +48,7 @@ rm -rf $RPM_BUILD_ROOT %{_sbindir}/vendstat %{_sbindir}/dump_mfts.sh %{_sbindir}/dump_lfts.sh +%{_sbindir}/check_lft_balance.pl %{_sbindir}/set_nodedesc.sh %{_sbindir}/sm* %define _perldir %(perl -e 'use Config; $T=$Config{installsitearch}; $T=~/(.*)\\/site_perl.*/; print $1;') @@ -56,6 +57,9 @@ rm -rf $RPM_BUILD_ROOT %doc README COPYING ChangeLog %changelog +* Mon Mar 03 2008 Albert Chu [EMAIL PROTECTED] - 1.3.5 +- Add check_lft_balance script. + * Wed Oct 31 2007 Ira Weiny [EMAIL PROTECTED] - 1.3.2 - Change switch-map option to node-name-map diff --git a/infiniband-diags/man/check_lft_balance.8 b/infiniband-diags/man/check_lft_balance.8 new file mode 100644 index 000..35243f6 --- /dev/null +++ b/infiniband-diags/man/check_lft_balance.8 @@ -0,0 +1,42 @@ +.TH CHECK_LFT_BALANCE.SH 8 March 1, 2008 OpenIB OpenIB Diagnostics + +.SH NAME +check_lft_balance.sh \- check InfiniBand unicast forwarding tables balance + +.SH SYNOPSIS +.B check_lft_balance.sh +[-hRv] + + +.SH DESCRIPTION +.PP +check_lft_balance.sh is a script which checks for balancing in Infiniband +unicast forwarding tables. It analyzes the output of +.BR dump_lfts(8) +and +.BR iblinkinfo(8). + +.SH OPTIONS + +.PP +.TP +\fB\-h\fR +show help +.TP +\fB\-R\fR +Recalculate dump_lfts information, ie do not use the cached +information. 
This option is slower but should be used if the diag tools have +not been used for some time or if there are other reasons to believe that +the fabric has changed. +.TP +\fB\-v\fR +verbose output + +.SH SEE ALSO +.BR dump_lfts(8), +.BR iblinkinfo(8) + +.SH AUTHORS +.TP +Albert Chu +.RI [EMAIL PROTECTED] diff --git a/infiniband-diags/scripts/check_lft_balance.pl b/infiniband-diags/scripts/check_lft_balance.pl new file mode 100755 index 000..954f319 --- /dev/null +++ b/infiniband-diags/scripts/check_lft_balance.pl @@ -0,0 +1,314 @@ +#!/usr/bin/perl +# +# Copyright (C) 2001-2003 The Regents of the University of California. +# Copyright (c) 2006 The Regents of the University of California. +# Copyright (c) 2007 Voltaire, Inc. All rights reserved. +# +# Produced at Lawrence Livermore National Laboratory. +#
[ofa-general] [PATCH] opensm: set SA attribute offset to 0 when no records are returned
IBA 1.2.1 clarifies (t.187, p.897) that the SA Attribute offset shall be set to zero if zero attributes are returned. Fix this.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 opensm/opensm/osm_sa.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c
index d85463e..46c5bf7 100644
--- a/opensm/opensm/osm_sa.c
+++ b/opensm/opensm/osm_sa.c
@@ -372,6 +372,8 @@ osm_sa_send_error(IN osm_sa_t * sa,
 
 	if (p_resp_sa_mad->method == IB_MAD_METHOD_SET)
 		p_resp_sa_mad->method = IB_MAD_METHOD_GET;
+	else if (p_resp_sa_mad->method == IB_MAD_METHOD_GETTABLE)
+		p_resp_sa_mad->attr_offset = 0;
 
 	p_resp_sa_mad->method |= IB_MAD_METHOD_RESP_MASK;
 
@@ -473,7 +475,7 @@ void osm_sa_respond(osm_sa_t *sa, osm_madw_t *madw, size_t attr_size,
 	resp_sa_mad->sm_key = 0;
 
 	/* Fill in the offset (paylen will be done by the rmpp SAR) */
-	resp_sa_mad->attr_offset = ib_get_attr_offset(attr_size);
+	resp_sa_mad->attr_offset = num_rec ? ib_get_attr_offset(attr_size) : 0;
 
 	p = ib_sa_mad_get_payload_ptr(resp_sa_mad);
 
-- 
1.5.4.rc2.60.gb2e62
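The rule this patch enforces can be sketched outside OpenSM as a small helper. This is only an illustration, not OpenSM code: `sa_attr_offset()` is a hypothetical name, it returns a host-order value, and it assumes the offset counts 8-byte units rounded up. The real `ib_get_attr_offset()` may differ in detail, for example in byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of OpenSM): attribute offset for a
 * table-style SA response.
 * Assumption: the offset is counted in 8-byte units, rounded up.
 * Returns 0 when no records are returned, which is the behaviour the
 * patch introduces.
 */
static uint16_t sa_attr_offset(size_t attr_size, unsigned int num_rec)
{
	if (num_rec == 0)
		return 0;
	return (uint16_t)((attr_size + 7) / 8);
}
```

With this helper, an empty response always carries offset 0 whatever the attribute size, while a non-empty one carries the rounded-up size.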
[ofa-general] [PATCH] opensm: rename osm_sa_vendor_send() to osm_sa_send()
Rename osm_sa_vendor_send() to osm_sa_send() (since it is not part of vendor library). Also it changes prototype to match better other SA sender functions. Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED] --- opensm/include/opensm/osm_sa.h | 17 ++--- opensm/opensm/osm_inform.c |3 +-- opensm/opensm/osm_sa.c | 21 + opensm/opensm/osm_sa_class_port_info.c |2 +- 4 files changed, 17 insertions(+), 26 deletions(-) diff --git a/opensm/include/opensm/osm_sa.h b/opensm/include/opensm/osm_sa.h index f4f751b..370e4e0 100644 --- a/opensm/include/opensm/osm_sa.h +++ b/opensm/include/opensm/osm_sa.h @@ -351,20 +351,17 @@ osm_sa_bind(IN osm_sa_t * const p_sa, IN const ib_net64_t port_guid); * SEE ALSO */ -/f* OpenSM: SA/osm_sa_vendor_send +/f* OpenSM: SA/osm_sa_send * NAME -* osm_sa_vendor_send +* osm_sa_send * * DESCRIPTION * Sends SA MAD via osm_vendor_send and maintains the QP1 sent statistic * * SYNOPSIS */ -ib_api_status_t -osm_sa_vendor_send(IN osm_bind_handle_t h_bind, - IN osm_madw_t * const p_madw, - IN boolean_t const resp_expected, - IN osm_subn_t * const p_subn); +ib_api_status_t osm_sa_send(osm_sa_t *sa, IN osm_madw_t * const p_madw, + IN boolean_t const resp_expected); /f* IBA Base: Types/osm_sa_send_error * NAME @@ -376,10 +373,8 @@ osm_sa_vendor_send(IN osm_bind_handle_t h_bind, * * SYNOPSIS */ -void -osm_sa_send_error(IN osm_sa_t * sa, - IN const osm_madw_t * const p_madw, - IN const ib_net16_t sa_status); +void osm_sa_send_error(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, + IN const ib_net16_t sa_status); /* * PARAMETERS * sa diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c index bbd573c..9553f7f 100644 --- a/opensm/opensm/osm_inform.c +++ b/opensm/opensm/osm_inform.c @@ -365,8 +365,7 @@ static ib_api_status_t __osm_send_report(IN osm_infr_t * p_infr_rec,/* the info *p_report_ntc = *p_ntc; /* The TRUE is for: response is expected */ - osm_sa_vendor_send(p_report_madw-h_bind, p_report_madw, TRUE, - p_infr_rec-sa-p_subn); + 
osm_sa_send(p_infr_rec-sa, p_report_madw, TRUE); Exit: OSM_LOG_EXIT(p_log); diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c index 4edce47..d85463e 100644 --- a/opensm/opensm/osm_sa.c +++ b/opensm/opensm/osm_sa.c @@ -318,19 +318,17 @@ Exit: return (status); } -ib_api_status_t -osm_sa_vendor_send(IN osm_bind_handle_t h_bind, - IN osm_madw_t * const p_madw, - IN boolean_t const resp_expected, - IN osm_subn_t * const p_subn) +ib_api_status_t osm_sa_send(osm_sa_t *sa, + IN osm_madw_t * const p_madw, + IN boolean_t const resp_expected) { ib_api_status_t status; - cl_atomic_inc(p_subn-p_osm-stats.sa_mads_sent); - status = osm_vendor_send(h_bind, p_madw, resp_expected); + cl_atomic_inc(sa-p_subn-p_osm-stats.sa_mads_sent); + status = osm_vendor_send(p_madw-h_bind, p_madw, resp_expected); if (status != IB_SUCCESS) { - cl_atomic_dec(p_subn-p_osm-stats.sa_mads_sent); - OSM_LOG(p_subn-p_osm-log, OSM_LOG_ERROR, ERR 4C04: + cl_atomic_dec(sa-p_subn-p_osm-stats.sa_mads_sent); + OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 4C04: osm_vendor_send failed, status = %s\n, ib_get_err_str(status)); } @@ -392,8 +390,7 @@ osm_sa_send_error(IN osm_sa_t * sa, if (osm_log_is_active(sa-p_log, OSM_LOG_FRAMES)) osm_dump_sa_mad(sa-p_log, p_resp_sa_mad, OSM_LOG_FRAMES); - osm_sa_vendor_send(osm_madw_get_bind_handle(p_resp_madw), - p_resp_madw, FALSE, sa-p_subn); + osm_sa_send(sa, p_resp_madw, FALSE); Exit: OSM_LOG_EXIT(sa-p_log); @@ -501,7 +498,7 @@ void osm_sa_respond(osm_sa_t *sa, osm_madw_t *madw, size_t attr_size, p += attr_size; } - osm_sa_vendor_send(resp_madw-h_bind, resp_madw, FALSE, sa-p_subn); + osm_sa_send(sa, resp_madw, FALSE); osm_dump_sa_mad(sa-p_log, resp_sa_mad, OSM_LOG_FRAMES); Exit: diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c index 3a76a69..f0afb32 100644 --- a/opensm/opensm/osm_sa_class_port_info.c +++ b/opensm/opensm/osm_sa_class_port_info.c @@ -174,7 +174,7 @@ __osm_cpi_rcv_respond(IN osm_sa_t * sa, if 
(osm_log_is_active(sa-p_log, OSM_LOG_FRAMES)) osm_dump_sa_mad(sa-p_log, p_resp_sa_mad, OSM_LOG_FRAMES); - osm_sa_vendor_send(p_resp_madw-h_bind, p_resp_madw, FALSE, sa-p_subn); + osm_sa_send(sa, p_resp_madw, FALSE); Exit:
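The counting pattern in osm_sa_send() (bump the sent counter first, undo it if the send fails) can be shown in isolation. The sketch below is a stand-alone illustration under stated assumptions: `struct sa_stats`, `send_fn`, `sa_send_counted()` and the two demo transports are hypothetical names, and C11 atomics stand in for cl_atomic_inc()/cl_atomic_dec().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the statistics block used by osm_sa_send(). */
struct sa_stats {
	atomic_int sa_mads_sent;
};

/* Hypothetical transport callback: returns true on success. */
typedef bool (*send_fn)(const void *mad);

/*
 * Count the MAD as sent before handing it to the transport, and roll
 * the counter back if the transport reports failure.
 */
static bool sa_send_counted(struct sa_stats *stats, send_fn send,
			    const void *mad)
{
	atomic_fetch_add(&stats->sa_mads_sent, 1);
	if (!send(mad)) {
		atomic_fetch_sub(&stats->sa_mads_sent, 1);
		return false;
	}
	return true;
}

/* Two trivial transports for demonstration only. */
static bool send_ok(const void *mad)
{
	(void)mad;
	return true;
}

static bool send_fail(const void *mad)
{
	(void)mad;
	return false;
}
```

Calling `sa_send_counted()` with `send_ok` leaves the counter one higher, while calling it with `send_fail` leaves it unchanged. Incrementing before the send rather than after is presumably meant to keep the statistic from lagging behind a MAD that is already on the wire, but that motivation is a guess and is not stated in the patch.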
Re: [ofa-general] [PATCH] opensm: set SA attribute offset to 0 when no records are returned
Sasha Khapyorsky wrote:
> IBA 1.2.1 clarifies (t.187, p.897) that the SA Attribute offset shall
> be set to zero if zero attributes are returned. Fix this.

Nice catch, thanks.

BTW, are you aware of any other IBA 1.2.1-related issues that need to be fixed? I mean, is OpenSM fully IBA 1.2.1 compliant?

-- Yevgeny

> Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
> ---
>  opensm/opensm/osm_sa.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c
> index d85463e..46c5bf7 100644
> --- a/opensm/opensm/osm_sa.c
> +++ b/opensm/opensm/osm_sa.c
> @@ -372,6 +372,8 @@ osm_sa_send_error(IN osm_sa_t * sa,
>
>  	if (p_resp_sa_mad->method == IB_MAD_METHOD_SET)
>  		p_resp_sa_mad->method = IB_MAD_METHOD_GET;
> +	else if (p_resp_sa_mad->method == IB_MAD_METHOD_GETTABLE)
> +		p_resp_sa_mad->attr_offset = 0;
>
>  	p_resp_sa_mad->method |= IB_MAD_METHOD_RESP_MASK;
>
> @@ -473,7 +475,7 @@ void osm_sa_respond(osm_sa_t *sa, osm_madw_t *madw, size_t attr_size,
>  	resp_sa_mad->sm_key = 0;
>
>  	/* Fill in the offset (paylen will be done by the rmpp SAR) */
> -	resp_sa_mad->attr_offset = ib_get_attr_offset(attr_size);
> +	resp_sa_mad->attr_offset = num_rec ? ib_get_attr_offset(attr_size) : 0;
>
>  	p = ib_sa_mad_get_payload_ptr(resp_sa_mad);
Re: [nfs-rdma-devel] [ofa-general] Status of NFS-RDMA ? (fwd)
On Fri, 2008-02-29 at 09:29 +0100, Sebastian Schmitzdorff wrote:
> hi pawel,
>
> I was wondering if you have achieved better nfs rdma benchmark results
> by now?

Pawel: What is your network hardware setup?

Thanks,
Tom

> regards
> Sebastian
>
> Pawel Dziekonski wrote:
> > hi,
> >
> > the saga continues. ;)
> >
> > very basic benchmarks and surprising (at least for me) results - it
> > looks like reading is much slower than writing and NFS/RDMA is twice
> > as slow in reading as classic NFS. :o
> >
> > results below - comments appreciated!
> >
> > regards, Pawel
> >
> > both nfs server and client have 8-cores, 16 GB RAM, Mellanox DDR HCAs
> > (MT25204) connected port-port (no switch).
> >
> > local_hdd  - 2 sata2 disks in soft-raid0,
> > nfs_ipoeth - classic nfs over ethernet,
> > nfs_ipoib  - classic nfs over IPoIB,
> > nfs_rdma   - NFS/RDMA.
> >
> > simple write of 36GB file with dd (both machines have 16GB RAM):
> >
> > /usr/bin/time -p dd if=/dev/zero of=/mnt/qqq bs=1M count=36000
> >
> > local_hdd   sys 54.52  user 0.04  real 254.59
> > nfs_ipoib   sys 36.35  user 0.00  real 266.63
> > nfs_rdma    sys 39.03  user 0.02  real 323.77
> > nfs_ipoeth  sys 34.21  user 0.01  real 375.24
> >
> > remount /mnt to clear cache and read a file from nfs share and write
> > it to /dev/:
> >
> > /usr/bin/time -p dd if=/mnt/qqq of=/scratch/qqq bs=1M
> >
> > nfs_ipoib   sys 59.04  user 0.02  real 571.57
> > nfs_ipoeth  sys 58.92  user 0.02  real 606.61
> > nfs_rdma    sys 62.57  user 0.03  real 1296.36
> >
> > results from bonnie++:
> >
> > Version 1.03c        ------Sequential Write------- --Sequential Read--- --Random-
> >                      -Per Chr- --Block-- -Rewrite- -Per Chr-  --Block-- --Seeks--
> > Machine    Size      K/sec %CP K/sec %CP K/sec %CP K/sec %CP  K/sec %CP  /sec %CP
> > local_hdd  35G:128k            93353  12 58329   6           143293   7 243.6   1
> > local_hdd  35G:256k            92283  11 58189   6           144202   8 172.2   2
> > local_hdd  35G:512k            93879  12 57715   6           144167   8 128.2   4
> > local_hdd  35G:1024k           93075  12 58637   6           144172   8  95.3   7
> > nfs_ipoeth 35G:128k            91325   7 31848   4            64299   4 170.2   1
> > nfs_ipoeth 35G:256k            90668   7 32036   5            64542   4 163.2   2
> > nfs_ipoeth 35G:512k            93348   7 31757   5            64454   4  85.7   3
> > nfs_ipoeth 35G:1024k           91283   7 31869   5            64241   5  51.7   4
> > nfs_ipoib  35G:128k            91733   7 36641   5            65839   4 178.4   2
> > nfs_ipoib  35G:256k            92453   7 36567   6            66682   4 166.9   3
> > nfs_ipoib  35G:512k            91157   7 37660   6            66318   4  86.8   3
> > nfs_ipoib  35G:1024k           92111   7 35786   6            66277   5  53.3   4
> > nfs_rdma   35G:128k            91152   8 29942   5            32147   2 187.0   1
> > nfs_rdma   35G:256k            89772   7 30560   5            34587   2 158.4   3
> > nfs_rdma   35G:512k            91290   7 29698   5            34277   2  60.9   2
> > nfs_rdma   35G:1024k           91336   8 29052   5            31742   2  41.5   3
> >
> >                  ------Sequential Create------ --------Random Create--------
> >                  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> > Machine    files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> > local_hdd     16 10587  36 +++++ +++  8674  29 10727  35 +++++ +++  7015  28
> > local_hdd     16 11372  41 +++++ +++  8490  29 11192  43 +++++ +++  6881  27
> > local_hdd     16 10789  35 +++++ +++  8520  29 11468  46 +++++ +++  6651  24
> > local_hdd     16 10841  40 +++++ +++  8443  28 11162  41 +++++ +++  6441  22
> > nfs_ipoeth    16  3753   7 13390  12  3795   7  3773   8 22181  16  3635   7
> > nfs_ipoeth    16  3762   8 12358   7  3713   8  3753   7 20448  13  3632   6
> > nfs_ipoeth    16  3834   7 12697   6  3729   8  3725   9 22807  11  3673   7
> > nfs_ipoeth    16  3729   8 14260  10  3774   7  3744   7 25285  14  3688   7
> > nfs_ipoib     16  6803  17 +++++ +++  6843  15  6820  14 +++++ +++  5834  11
> > nfs_ipoib     16  6587  16 +++++ +++  4959   9  6832  14 +++++ +++  5608  12
> > nfs_ipoib     16  6820  18 +++++ +++  6636  15  6479  15 +++++ +++  5679  13
> > nfs_ipoib     16  6475  14 +++++ +++  6435  14  5543  11 +++++ +++  5431  11
> > nfs_rdma      16  7014  15 +++++ +++  6714  10  7001  14 +++++ +++  5683   8
> > nfs_rdma      16  7038  13 +++++ +++  6713  12  6956  11 +++++ +++  5488   8
> > nfs_rdma      16  7058  12 +++++ +++  6797  11  6989  14 +++++ +++  5761   9
> > nfs_rdma      16  7201  13 +++++ +++  6821  12  7072  15 +++++ +++  5609   9
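The dd timings quoted above are easier to compare as throughput. The helper below is a minimal sketch, not something from the thread: `throughput_mib_s()` is a hypothetical name, and it assumes the read test moved the same 36000 MiB file that the write test created (the read command gives no count).

```c
#include <assert.h>

/*
 * Hypothetical helper: throughput in MiB/s for `mib` mebibytes moved
 * in `seconds` of wall-clock ("real") time. Returns 0 for a
 * non-positive duration instead of dividing by zero.
 */
static double throughput_mib_s(double mib, double seconds)
{
	return seconds > 0.0 ? mib / seconds : 0.0;
}
```

Under that assumption, the "real" times work out to roughly 141 MiB/s for the local_hdd write, 111 MiB/s for the nfs_rdma write, 63 MiB/s for the nfs_ipoib read and 28 MiB/s for the nfs_rdma read, which matches the observation that NFS/RDMA reads at about half the speed of classic NFS.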
[ofa-general] Re: [PATCH] mmu notifiers #v8
On Sun, Mar 02, 2008 at 04:54:57PM +0100, Andrea Arcangeli wrote:
> Difference between #v7 and #v8:
>
> 1) s/age_page/clear_flush_young/ (Nick's suggestion)
> 2) macro fix (Andrew)
> 3) move release before final unmap_vmas (for GRU, Jack/Christoph)
> 4) microoptimize mmu_notifier_unregister (Christoph)
> 5) use mmap_sem for registration serialization (Christoph)
>
> The (void)xxx in macros doesn't work with args. Christoph's solution
> looks best in avoiding warnings, even if it forces the mmu notifier
> operation structure to be visible even if MMU_NOTIFIER=n (that's the
> only downside).

I have a couple of cleanup patches that change the structure of this to something I prefer. Others may not, but I'll post them for debate anyway.

> I didn't drop invalidate_page, because invalidate_range_begin/end
> would be slower for usages like KVM/GRU (we don't need a begin/end
> there because where invalidate_page is called, the VM holds a
> reference on the page). do_wp_page should also use invalidate_page
> since it can free the page after dropping the PT lock without losing
> any performance (that's not true for the places where invalidate_range
> is called).

I'm still not completely happy with this. I had a very quick look at the GRU driver, but I don't see why it can't be implemented more like the regular TLB model, and have TLB insertions depend on the linux pte, and do invalidates _after_ restricting permissions to the pte.

Ie. I'd still like to get rid of invalidate_range_begin, and get rid of invalidate calls from places where permissions are relaxed.

> It'd be nice if everyone involved can agree to converge on this API
> for .25. KVM/GRU (and perhaps Quadrics) and similar usages will be
> fully covered in .25.

If we can agree on the API, then I don't see any reason why it can't go into 2.6.25, unless someone wants more time to review it (but the 2.6.25 release should be quite far away still, so there should be quite a bit of time).
[ofa-general] Re: [PATCH] mmu notifiers #v8
On Sun, Mar 02, 2008 at 04:54:57PM +0100, Andrea Arcangeli wrote: Difference between #v7 and #v8: [patch] mmu-v8: demacro Remove the macros from mmu_notifier.h, in favour of functions. This requires untangling the include order circular dependencies as well, so just remove struct mmu_notifier_head in favour of just using the hlist in mm_struct. Signed-off-by: Nick Piggin [EMAIL PROTECTED] --- Index: linux-2.6/include/linux/mmu_notifier.h === --- linux-2.6.orig/include/linux/mmu_notifier.h +++ linux-2.6/include/linux/mmu_notifier.h @@ -55,12 +55,13 @@ struct mmu_notifier { #ifdef CONFIG_MMU_NOTIFIER -struct mmu_notifier_head { - struct hlist_head head; -}; - #include linux/mm_types.h +static inline int mm_has_notifiers(struct mm_struct *mm) +{ + return unlikely(!hlist_empty(mm-mmu_notifier_list)); +} + /* * Must hold the mmap_sem for write. * @@ -79,33 +80,59 @@ extern void mmu_notifier_register(struct */ extern void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm); -extern void mmu_notifier_release(struct mm_struct *mm); -extern int mmu_notifier_clear_flush_young(struct mm_struct *mm, + +extern void __mmu_notifier_release(struct mm_struct *mm); +extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm, unsigned long address); +extern void __mmu_notifier_invalidate_page(struct mm_struct *mm, + unsigned long address); +extern void __mmu_notifier_invalidate_range_begin(struct mm_struct *mm, + unsigned long start, unsigned long end); +extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, + unsigned long start, unsigned long end); + + +static inline void mmu_notifier_release(struct mm_struct *mm) +{ + if (mm_has_notifiers(mm)) + __mmu_notifier_release(mm); +} + +static inline int mmu_notifier_clear_flush_young(struct mm_struct *mm, + unsigned long address) +{ + if (mm_has_notifiers(mm)) + return __mmu_notifier_clear_flush_young(mm, address); + return 0; +} + +static inline void mmu_notifier_invalidate_page(struct mm_struct 
*mm, + unsigned long address) +{ + if (mm_has_notifiers(mm)) + __mmu_notifier_invalidate_page(mm, address); +} + +static inline void mmu_notifier_invalidate_range_begin(struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + if (mm_has_notifiers(mm)) + __mmu_notifier_invalidate_range_begin(mm, start, end); +} + +static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + if (mm_has_notifiers(mm)) + __mmu_notifier_invalidate_range_end(mm, start, end); +} -static inline void mmu_notifier_head_init(struct mmu_notifier_head *mnh) +static inline void mmu_notifier_mm_init(struct mm_struct *mm) { - INIT_HLIST_HEAD(mnh-head); + INIT_HLIST_HEAD(mm-mmu_notifier_list); } -#define mmu_notifier(function, mm, args...)\ - do {\ - struct mmu_notifier *__mn; \ - struct hlist_node *__n; \ - struct mm_struct * __mm = mm; \ - \ - if (unlikely(!hlist_empty(__mm-mmu_notifier.head))) { \ - rcu_read_lock();\ - hlist_for_each_entry_rcu(__mn, __n, \ -__mm-mmu_notifier.head, \ -hlist) \ - if (__mn-ops-function)\ - __mn-ops-function(__mn, \ - __mm, \ - args); \ - rcu_read_unlock(); \ - } \ - } while (0) + #define ptep_clear_flush_notify(__vma, __address, __ptep) \ ({ \ @@ -113,7 +140,7 @@ static inline void mmu_notifier_head_ini struct vm_area_struct * ___vma = __vma; \ unsigned long ___address = __address; \ __pte = ptep_clear_flush(___vma, ___address, __ptep); \
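The shape of the "demacro" change, an inline wrapper that does a cheap emptiness check before calling an out-of-line function, can be tried in user space. The following is a simplified sketch under stated assumptions, not kernel code: `struct notifier`, `struct mm`, `notify_invalidate_page()` and the demo counter and hook are hypothetical, a plain singly linked list replaces the hlist, and there is no RCU or locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct notifier {
	struct notifier *next;
	void (*invalidate_page)(struct notifier *n, unsigned long address);
};

struct mm {
	struct notifier *notifier_list;	/* NULL when nothing is registered */
};

static int slow_path_calls;		/* demo counter only */
static unsigned long last_address;	/* written by the demo hook */

/* Demo hook: remembers the last address it was asked to invalidate. */
static void record_page(struct notifier *n, unsigned long address)
{
	(void)n;
	last_address = address;
}

/* Out-of-line slow path: walk the list and call each registered hook. */
static void notify_invalidate_page_slow(struct mm *mm, unsigned long address)
{
	struct notifier *n;

	slow_path_calls++;
	for (n = mm->notifier_list; n; n = n->next)
		if (n->invalidate_page)
			n->invalidate_page(n, address);
}

static inline int mm_has_notifiers(const struct mm *mm)
{
	return mm->notifier_list != NULL;
}

/*
 * Inline wrapper: callers pay only for the emptiness test when no
 * notifier is registered, and the arguments are type-checked, which a
 * variadic macro does not give you.
 */
static inline void notify_invalidate_page(struct mm *mm, unsigned long address)
{
	if (mm_has_notifiers(mm))
		notify_invalidate_page_slow(mm, address);
}
```

With an empty list the wrapper returns without touching the slow path. Once a `struct notifier` whose hook is `record_page` is linked in, the slow path runs and the hook sees the address.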
[ofa-general] Re: [PATCH] mmu notifiers #v8
On Sun, Mar 02, 2008 at 04:54:57PM +0100, Andrea Arcangeli wrote:
> Difference between #v7 and #v8:

This one goes on top of the previous patch.

[patch] mmu-v8: typesafe

Move definition of struct mmu_notifier and struct mmu_notifier_ops under CONFIG_MMU_NOTIFIER to ensure they don't get dereferenced when they don't make sense.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
---
Index: linux-2.6/include/linux/mmu_notifier.h
===================================================================
--- linux-2.6.orig/include/linux/mmu_notifier.h
+++ linux-2.6/include/linux/mmu_notifier.h
@@ -3,8 +3,12 @@
 
 #include <linux/list.h>
 #include <linux/spinlock.h>
+#include <linux/mm_types.h>
 
 struct mmu_notifier;
+struct mmu_notifier_ops;
+
+#ifdef CONFIG_MMU_NOTIFIER
 
 struct mmu_notifier_ops {
 	/*
@@ -53,10 +57,6 @@ struct mmu_notifier {
 	const struct mmu_notifier_ops *ops;
 };
 
-#ifdef CONFIG_MMU_NOTIFIER
-
-#include <linux/mm_types.h>
-
 static inline int mm_has_notifiers(struct mm_struct *mm)
 {
 	return unlikely(!hlist_empty(&mm->mmu_notifier_list));
[ofa-general] Re: [PATCH] mmu notifiers #v8
On Sun, Mar 02, 2008 at 04:54:57PM +0100, Andrea Arcangeli wrote:
> Difference between #v7 and #v8:

Here is just a couple of checkpatch fixes on top of the last patches.

Index: linux-2.6/include/linux/mmu_notifier.h
===================================================================
--- linux-2.6.orig/include/linux/mmu_notifier.h
+++ linux-2.6/include/linux/mmu_notifier.h
@@ -46,7 +46,7 @@ struct mmu_notifier_ops {
 	 */
 	void (*invalidate_range_begin)(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
-				 unsigned long start, unsigned long end);
+				       unsigned long start, unsigned long end);
 	void (*invalidate_range_end)(struct mmu_notifier *mn,
 				     struct mm_struct *mm,
 				     unsigned long start, unsigned long end);
@@ -137,7 +137,7 @@ static inline void mmu_notifier_mm_init(
 #define ptep_clear_flush_notify(__vma, __address, __ptep)		\
 ({									\
 	pte_t __pte;							\
-	struct vm_area_struct * ___vma = __vma;				\
+	struct vm_area_struct *___vma = __vma;				\
 	unsigned long ___address = __address;				\
 	__pte = ptep_clear_flush(___vma, ___address, __ptep);		\
 	mmu_notifier_invalidate_page(___vma->vm_mm, ___address);	\
@@ -147,7 +147,7 @@ static inline void mmu_notifier_mm_init(
 #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
 ({									\
 	int __young;							\
-	struct vm_area_struct * ___vma = __vma;				\
+	struct vm_area_struct *___vma = __vma;				\
 	unsigned long ___address = __address;				\
 	__young = ptep_clear_flush_young(___vma, ___address, __ptep);	\
 	__young |= mmu_notifier_clear_flush_young(___vma->vm_mm,	\
[ofa-general] Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges
On Thursday 28 February 2008 09:35, Christoph Lameter wrote:
> On Wed, 20 Feb 2008, Nick Piggin wrote:
> > On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> >
> > Also, what we are going to need here are not skeleton drivers that
> > just do all the *easy* bits (of registering their callbacks), but
> > actual fully working examples that do everything that any real driver
> > will need to do. If not for the sanity of the driver writer, then for
> > the sanity of the VM developers (I don't want to have to understand
> > xpmem or infiniband in order to understand how the VM works).
>
> There are 3 different drivers that can already use it but the code is
> complex and not easy to review. Skeletons are easy to allow people to
> get started with it.

Your skeleton is just registering notifiers and saying

/* you fill the hard part in */

If somebody needs a skeleton in order just to register the notifiers, then almost by definition they are unqualified to write the hard part ;)

> > >  	lru_add_drain();
> > >  	tlb = tlb_gather_mmu(mm, 0);
> > >  	update_hiwater_rss(mm);
> > > +	mmu_notifier(invalidate_range_begin, mm, address, end, atomic);
> > >  	end = unmap_vmas(&tlb, vma, address, end, &nr_accounted, details);
> > >  	if (tlb)
> > >  		tlb_finish_mmu(tlb, address, end);
> > > +	mmu_notifier(invalidate_range_end, mm, address, end, atomic);
> > >  	return end;
> > >  }
> >
> > Where do you invalidate for munmap()?
>
> zap_page_range() called from unmap_vmas().

But it is not allowed to sleep. Where do you call the sleepable one from?

> > Also, how do you resolve the case where you are not allowed to sleep?
> > I would have thought either you have to handle it, in which case
> > nobody needs to sleep; or you can't handle it, in which case the code
> > is broken.
>
> That can be done in a variety of ways:
>
> 1. Change VM locking
>
> 2. Not handle file backed mappings (XPmem could work mostly in such a
>    config)
>
> 3. Keep the refcount elevated until pages are freed in another
>    execution context.

OK, there are ways to solve it or hack around it. But this is exactly why I think the implementations should be kept separate. Andrea's notifiers are coherent, work on all types of mappings, and will hopefully match closely the regular TLB invalidation sequence in the Linux VM (at the moment it is quite close, but I hope to make it a bit closer) so that it requires almost no changes to the mm.

All the other things that try to make it sleep are hacking holes in it (e.g. by removing coherency). So I don't think it is reasonable to require that any patch handle all cases. I actually think Andrea's patch is quite nice and simple itself, whereas I am against the patches that you posted.

What about a completely different approach... XPmem runs over NUMAlink, right? Why not provide some non-sleeping way to basically IPI remote nodes over the NUMAlink where they can process the invalidation? If your intra-node cache coherency has to run over this link anyway, then presumably it is capable.

Or another idea, why don't you LD_PRELOAD in the MPT library to also intercept munmap, mprotect, mremap etc. as well as just fork()? That would give you similarly good enough coherency as the mmu notifier patches except that you can't swap (which Robin said was not a big problem).