Hi all,
I just tried this same test on OGS/GE 2011.11p1 and it works
perfectly.
On Wed, Feb 6, 2013 at 6:28 PM, Ben De Luca <[email protected]> wrote:
> I have a 8.0.0c cluster in production and an 8.0.0e running for testing.
>
> No one has noticed it, though I have seen it before.
> Oh, I just managed to hit another bug on 8.0.0c.
>
>
> I am trying to simulate this.
> I have 2 users (names changed to protect the innocent):
>
> user1
> echo sleep 10000 | qsub -q linux2 -t 1-100 -tc 1
>
> user2
> echo sleep 10000 | qsub -q linux2 -t 1-2000 -tc 1
>
>
>
> qstat -ext -pri -u user1,user2
>
> job-ID  prior    nurg     npprior  ntckts   ppri  name   user   project  department  state  submit/start at      cpu         mem      io       tckts        ovrts  otckt  ftckt  stckt        share  queue    slots  ja-task-ID
> 115460  2.50000  0.00000  0.00000  1.00000  0     STDIN  user2  NA       defaultdep  r      02/06/2013 18:21:21  0:00:00:00  0.00000  0.00007  -1780537303  0      0      0      -1780537303  0.59   linux2@  1      1
> 115461  1.91619  0.00000  0.00000  0.70810  0     STDIN  user1  NA       defaultdep  r      02/06/2013 18:21:21  0:00:00:00  0.00000  0.00007  1780458312   0      0      0      1780458312   0.41   linux2@  1      1
> 115460  0.00000  0.00000  0.00000  0.00000  0     STDIN  user2  NA       defaultdep  qw     02/06/2013 18:21:05                                0            0      0      0      0            0.00            1      2-2000:1
> 115461  0.00000  0.00000  0.00000  0.00000  0     STDIN  user1  NA       defaultdep  qw     02/06/2013 18:21:13                                0            0      0      0      0            0.00            1      2-100:1
>
>
> user2 gets more tickets, and the ticket counts have overflowed into negative values.
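> The negative numbers look like a signed 32-bit wrap-around in the ticket
> counter. A minimal bash sketch; the raw total of 2514429993 is inferred
> (it is exactly -1780537303 + 2^32), not observed:

```shell
# Hypothetical illustration: assume tickets are held in a signed 32-bit int,
# so any raw total above 2^31 - 1 = 2147483647 wraps negative.
total=2514429993   # inferred raw ticket total, just past 2^31 - 1
# Simulate 32-bit two's-complement wrap-around using bash's 64-bit arithmetic:
wrapped=$(( (total + 2147483648) % 4294967296 - 2147483648 ))
echo $wrapped      # -1780537303, matching the qstat output above
```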
>
>
>
>
>
> On Wed, Feb 6, 2013 at 2:05 PM, Orlando Richards <
> [email protected]> wrote:
>
>> Hi Ben,
>>
>>
>> On 06/02/13 13:12, Ben De Luca wrote:
>>
>>> I'm fairly sure we are affected by this bug too. I am happy to help in
>>> the hunt, and I have looked through the code more than once.
>>>
>>>
>> Are you doing anything to work around it at all? At the moment, we're
>> adjusting the shares to accommodate the over accounting - but that is a
>> very blunt tool and skews our allocations massively. We're reluctant to go
>> for purely functional shares, as our service definition is currently fixed
>> on fair share.
>>
>>
>>> Which version of grid engine are you trying to fix? I haven't been
>>> following grid engine dev too closely; do we still have multiple forks?
>>>
>>>
>> We notice it most on our current 6.2u5 deployment, which we're moving
>> away from to 8.0.0e from Son Of Grid Engine. That's not to say it isn't
>> present in the 8.0.0e - we still have a lot of the troublesome workload on
>> the 6.2u5 cluster, and I'm sure I've seen it happening on the 8.0.0e
>> cluster (though I now don't have any evidence of that).
>>
>>
>> --
>> Orlando
>>
>>
>>
>>>
>>> On Wed, Feb 6, 2013 at 12:07 PM, Mark Dixon <[email protected]> wrote:
>>>
>>> On Wed, 6 Feb 2013, Orlando Richards wrote:
>>> ...
>>>
>>> I've had a go at digging through the code, but couldn't really make
>>> head nor tail of it - no doubt in large part due to my not being much
>>> of a coder :( Any pointers to get me bootstrapped would be most
>>> welcome.
>>>
>>>
>>> General comments about the source...
>>>
>>> Don't be intimidated. It's a large code base, but spend a little
>>> time and it'll start to make sense. Pick a little bit of it to focus
>>> on initially.
>>>
>>> Gridengine's source code is layered. The source distribution has a
>>> few HTML files describing them (some of which still need updating
>>> from the 6.0 days...). Functions near the very top and very bottom
>>> of the stack are relatively well commented, but the rest can be a
>>> little hit and miss.
>>>
>>> Ignoring most of the layers, you've essentially got:
>>>
>>> At the bottom you've got the wonderful CULL layer: it's very solid
>>> and provides gridengine with safe, complex data structures. I'd
>>> like to pat the person who wrote it on the back, although I admit
>>> I've yet to get my head round the advanced search functionality.
>>> State data for jobs and so on tends to be stored in it. Its use can
>>> be identified by data types and functions prefixed with "l".
>>>
>>> While I'm on data structures, there are also "dstrings" - which
>>> provide safe string handling.
>>>
>>> In the middle you've got the GDI, which is the set of libraries used
>>> by the different components to communicate with each other over the
>>> network.
>>>
>>> At the top you've got the qmaster, execd, etc., which can be thought
>>> of as loosely coupled applications that all use the same underlying
>>> libraries/layers to coordinate.
>>>
>>> I've spent most of my time in the execd, which is pretty easy but
>>> messy [a very large number of special cases - not totally unexpected
>>> with the number of platforms supported over the years, but ripe for
>>> some refactoring]. I've had a brief play in the qmaster and my first
>>> impression is that it's more consistent and "solid" than the execd,
>>> but more complicated.
>>>
>>>
>>> General tips for debugging gridengine...
>>>
>>> 1) Play with the loglevel setting in "qconf -sconf" and read the
>>> messages files.
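>>> For example (the messages-file path assumes a default spooling
>>> layout; adjust to your site):

```shell
# Show the current cluster-wide log level (log_err, log_warning or log_info):
qconf -sconf | grep loglevel
# Edit the global configuration in your $EDITOR and set "loglevel log_info":
qconf -mconf
# Then watch what the qmaster logs:
tail -f $SGE_ROOT/default/spool/qmaster/messages
```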
>>>
>>>
>>> 2) Figure out how to stick gridengine into debug mode.
>>> https://blogs.oracle.com/templedf/entry/using_debugging_output
>>>
>>> Essentially something like:
>>> * Setup sge environment (SGE_ROOT, SGE_QMASTER_PORT, etc.)
>>> * Execute: . $SGE_ROOT/util/dl.sh
>>> * Execute: dl 1
>>> * Execute: $SGE_ROOT/bin/lx-amd64/sge_execd
>>>
>>>
>>> The program will not daemonise and will print lots of interesting
>>> stuff. Different 'dl' values will give you different output. I
>>> generally find that anything greater than 1 is "too much".
>>>
>>> This technique will work for pretty much any gridengine component.
>>> Even qsub.
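>>> Putting those steps together, a session might look like this (paths
>>> and port are assumptions for a typical Linux x86-64 install):

```shell
export SGE_ROOT=/opt/sge          # adjust to your installation
export SGE_QMASTER_PORT=6444      # adjust to your configuration
. $SGE_ROOT/util/dl.sh            # defines the 'dl' shell function
dl 1                              # debug level 1; higher levels get very noisy
$SGE_ROOT/bin/lx-amd64/sge_execd  # stays in the foreground, printing trace output
```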
>>>
>>>
>>> 3) Run gridengine under gdb.
>>>
>>> I don't know if you've had much experience with gdb but, once you've
>>> got the hang of it, it's very useful in figuring out what some code
>>> generally does without actually understanding the details. Once
>>> you've followed your nose to something that doesn't look right, you
>>> can then spend time figuring things out.
>>>
>>> I think some of the gridengine forks try to provide builds with
>>> enough debugging information for this to work, but I tend to build
>>> my own gridengine so that I can easily recompile after editing the
>>> source with potential fixes.
>>>
>>> Make sure you build with the "-no-opt" and "-debug" flags to aimk
>>> (disables optimisation and enables debugging symbols) and keep the
>>> source tree kicking around for gdb to read. I run our production
>>> gridengine with those flags and haven't noticed any serious
>>> performance problems.
>>>
>>> Once you have gridengine running under gdb and playing with
>>> breakpoints and the rest, you can easily examine interesting data
>>> structures with commands like "p lWriteList(ptr)", "p
>>> lWriteElem(ptr)" and
>>> "p sge_dstring_get_string(ptr)" (where ptr is a lList*, lListElem*
>>> or dstring*, respectively).
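>>> A sketch of such a session (binary path assumed; the print helpers
>>> are the ones mentioned above):

```shell
# Start a -no-opt/-debug build of the execd under gdb (run as root on an
# execution host):
gdb $SGE_ROOT/bin/lx-amd64/sge_execd
# Then, inside gdb:
#   (gdb) break main
#   (gdb) run
#   (gdb) p lWriteList(ptr)              # dump an lList*
#   (gdb) p lWriteElem(ptr)              # dump an lListElem*
#   (gdb) p sge_dstring_get_string(ptr)  # read a dstring*
```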
>>>
>>>
>>> ...
>>>
>>> At the moment, I'm trying to get a reproducible test case together
>>> to allow for useful debugging - basic tests (sleep 60s) don't show
>>> an obvious triggering of the issue, so I'm moving onto more
>>> complicated tasks. Certainly, the issue does seem to create
>>> orders-of-magnitude differences in reported usage. Current offenders
>>> include BLAST jobs (run by our Biology users) - which are fairly
>>> memory heavy.
>>>
>>> ...
>>>
>>> Being able to reproduce the problem will obviously make things far,
>>> far easier! If you cannot, you're probably reduced to littering the
>>> relevant qmaster code with INFO(())/WARNING(())/ERROR(()) statements
>>> (and checking that loglevel in "qconf -sconf" is set to the
>>> appropriate value) and seeing what appears in the messages files in
>>> production.
>>>
>>> If you're lucky, the problem might be evident in the usage
>>> information being sent from the execd to the qmaster. Running the
>>> execd in debug mode with "dl 1" will reveal what CPU/MEM/IO values
>>> the qmaster is being given to be used in the accounting file and the
>>> share tree.
>>>
>>> If you're unlucky, the problem is in how the qmaster aggregates,
>>> records and decays the share tree values over time.
>>>
>>> If you're really unlucky, the problem might only occur if the
>>> various gridengine components are under severe stress.
>>>
>>> I find that having a non-production installation of gridengine
>>> kicking around, perhaps in virtual machines, is very handy :)
>>>
>>> Hope this helps...
>>>
>>>
>>> Mark
>>> --
>>> -----------------------------------------------------------------
>>> Mark Dixon                     Email: [email protected]
>>> HPC/Grid Systems Support       Tel (int): 35429
>>> Information Systems Services   Tel (ext): +44 (0)113 343 5429
>>> University of Leeds, LS2 9JT, UK
>>> -----------------------------------------------------------------
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
>>>
>>>
>>>
>>
>> --
>> Dr Orlando Richards
>> Information Services
>> IT Infrastructure Division
>> Unix Section
>> Tel: 0131 650 4994
>>
>> The University of Edinburgh is a charitable body, registered in Scotland,
>> with registration number SC005336.
>>
>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users