On 2010-06-02, at 11:54 AM, Chuck Hill wrote:
>
> On Jun 2, 2010, at 8:51 AM, Pascal Robert wrote:
>
>>
>> On 10-06-02 at 11:39, Chuck Hill wrote:
>>
>>>
>>> On Jun 2, 2010, at 8:16 AM, Pascal Robert wrote:
>>>
>>>>
>>>> On 10-06-02 at 10:30, Chuck Hill wrote:
>>>>
>>>>> That makes your code look guilty then. :-)
>>>>
>>>> Funny thing is that it's not really my code (i.e., I didn't write it); this
>>>> is code dating from WO 5.2. It's just that this app never had that much
>>>> traffic.
>>>>
>>>> And I did try load testing this app with JMeter, but since the URL
>>>> changes when the long response page is called (the session ID is put back
>>>> in the URL) and I don't know how to handle this, that part was not load
>>>> tested.
>>>>
>>>>> Check your long response page implementation again. Are there any
>>>>> exceptions in the log that might be related?
>>>>
>>>> Just to explain a bit more :
>>>>
>>>> - It's a (non-public) online store. When people log in, we create an order
>>>> in memory and customers add order items to the order. We don't store
>>>> anything in the DB until the payment is made with PayFlow. When we get the
>>>> response from PayFlow, we store a copy of the order (and the items) in our
>>>> Oracle DB. After that, we contact our SQL Server DB (actually, an
>>>> accounting system; we send the data to a stored procedure), and we get
>>>> the invoice number produced by the accounting system and store it in the
>>>> order EO in Oracle.
>>>>
>>>> So in summary :
>>>>
>>>> - People log in, we create an order EO; the EO is created in the session's
>>>> editing context
>>>> - People add items to the order
>>>> - They start the order payment steps
>>>> - Long response page kicks in
>>>> - We contact PayFlow to make the payment
>>>> - If the payment is successful, we store the order in Oracle
>>>> - We create a new EO, in a different EC, for SQL Server
>>>> - We update the order EO to store the invoice number in Oracle
>>>> - We generate (FOXML, generated in a separate JVM) the invoice in PDF
>>>> - Long response page is done, pageForResult is called
>>>>
>>>> Everything is done in session.defaultEditingContext EXCEPT the SQL Server
>>>> EOs,
>>>
>>> You are not using the session.defaultEditingContext in the long response
>>> page, are you? I am pretty sure that is an excellent source of deadlocks.
>>
>> Hmm, yes, we do use it in the long response page... But since
>> localInstanceOfObject won't let me have a copy in a new EC, what are the
>> options except not using the session EC?
>
> Not using the session EC would be a good choice. Make a different EC. Pass
> it into the long response page. Be careful handing off locking.
>
> You could also save the order in an "unpaid" state, then fetch it in the long
> response page and update it if paid, or delete it if not.
Ooh, yeah, you could do that too.
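
A minimal sketch of the "save as unpaid, then update or delete" idea, written in plain Java so it runs without WebObjects. The names `OrderStore`, `Order`, `OrderStatus` and `settle` are hypothetical stand-ins: in the real app the store would be the Oracle database, reached through an editing context created for the long response page rather than the session's.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins: a map plays the role of the database.
enum OrderStatus { UNPAID, PAID }

final class Order {
    final int id;
    OrderStatus status = OrderStatus.UNPAID;
    Order(int id) { this.id = id; }
}

final class OrderStore {
    private final Map<Integer, Order> rows = new HashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    // Step 1 (request thread): persist the order before payment starts.
    synchronized int saveUnpaid() {
        Order order = new Order(nextId.getAndIncrement());
        rows.put(order.id, order);
        return order.id;
    }

    // Step 2 (background task): look the order up again by key ...
    synchronized Order fetch(int id) { return rows.get(id); }

    // ... then mark it paid, or delete it if the payment failed.
    synchronized void settle(int id, boolean paymentSucceeded) {
        Order order = rows.get(id);
        if (order == null) {
            throw new IllegalStateException("No order with id " + id);
        }
        if (paymentSucceeded) {
            order.status = OrderStatus.PAID;
        } else {
            rows.remove(id);
        }
    }
}
```

Only the key crosses from the request thread to the background task here, so the two never have to share an object that belongs to the session.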
>
>
> Chuck
>
>
>>
>>>
>>> Chuck
>>>
>>>> where we create a new EOObjectStore, create a new EOEditingContext inside
>>>> the new object store, and save the new EO there:
>>>>
>>>> EOObjectStore osc = new EOObjectStoreCoordinator();
>>>> EOEditingContext ec = new EOEditingContext(osc);
>>>> ec.lock();
>>>> try {
>>>>     CommandesEcom commandeEcom = CommandesEcom.creerCommandesEcom(ec);
>>>>     ...
>>>>     ec.saveChanges();
>>>> } finally {
>>>>     // Always release the lock, even if saveChanges() throws
>>>>     ec.unlock();
>>>>     ec.dispose();
>>>>     osc.dispose();
>>>>     ec = null;
>>>>     osc = null;
>>>> }
>>>>
>>>> A co-worker suggested that we create a new editing context in the long
>>>> response page, and call EOUtilities.localInstanceOfObject to have a copy
>>>> of the order EO in the new EC, but the resulting EO is null, even if the
>>>> source is not.
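
One possible explanation, offered as an assumption rather than something confirmed in this thread: an EO that has been inserted but not yet saved only has a temporary global ID, so there is no database row for a second editing context to fault from. The sketch below models that idea in plain Java. `RowStore`, `PeerContext`, `OrderRecord` and `localInstance` are hypothetical stand-ins, not EOF classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the database: only saved records live here.
final class RowStore {
    final Map<Integer, String> rows = new HashMap<>();
    private int nextKey = 1;

    synchronized int insert(String data) {
        int key = nextKey++;
        rows.put(key, data);
        return key;
    }
}

// Hypothetical stand-in for an enterprise object.
final class OrderRecord {
    final String data;
    Integer permanentKey; // null until the owning context saves
    OrderRecord(String data) { this.data = data; }
}

// Hypothetical stand-in for an editing context.
final class PeerContext {
    private final RowStore store;
    PeerContext(RowStore store) { this.store = store; }

    OrderRecord create(String data) { return new OrderRecord(data); }

    // Saving is what assigns the permanent key.
    void save(OrderRecord record) {
        if (record.permanentKey == null) {
            record.permanentKey = store.insert(record.data);
        }
    }

    // A second context can only rebuild an object from its permanent key.
    OrderRecord localInstance(OrderRecord other) {
        if (other.permanentKey == null
                || !store.rows.containsKey(other.permanentKey)) {
            return null; // nothing in the store to load from
        }
        OrderRecord copy = new OrderRecord(store.rows.get(other.permanentKey));
        copy.permanentKey = other.permanentKey;
        return copy;
    }
}
```

If that is what is happening here, the order would need to be saved before another EC can see it.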
>>>>
>>>>> I'd also reduce the Maximum Adaptor threads (JavaMonitor -> Application
>>>>> configuration -> Application settings). 6 or 8 is probably more than
>>>>> enough for this app. That will at least reduce the size of the thread
>>>>> dumps. I'd also trim down the listen queue size to 2 or 4; might as
>>>>> well catch this as soon as possible.
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>>>
>>>>> On Jun 2, 2010, at 5:19 AM, Pascal Robert wrote:
>>>>>
>>>>>> ... And going back to the physical server didn't solve anything, I got
>>>>>> the same deadlock this morning.
>>>>>>
>>>>>>> Ok, so I will move back the DB to the physical server to see if the
>>>>>>> problem goes away.
>>>>>>>
>>>>>>>>
>>>>>>>> On Jun 1, 2010, at 6:34 AM, Pascal Robert wrote:
>>>>>>>>
>>>>>>>>> Hmm... And after I started using ERXWOLongResponsePage, I still got a
>>>>>>>>> deadlock, but this time it says that it's an EODatabaseContext lock:
>>>>>>>>>
>>>>>>>>> Thread t...@92163: (state = BLOCKED)
>>>>>>>>> - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
>>>>>>>>> - java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
>>>>>>>>> - com.webobjects.foundation.NSRecursiveLock.lock() @bci=54, line=72
>>>>>>>>> (Interpreted frame)
>>>>>>>>> - com.webobjects.eoaccess.EODatabaseContext.lock() @bci=56, line=1973
>>>>>>>>> (Interpreted frame)
>>>>>>>>> -
>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>>> @bci=5, line=130 (Interpreted frame)
>>>>>>>>> -
>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>>> @bci=34, line=166 (Interpreted frame)
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> We don't do any "manual" (i.e., in code) locking at the
>>>>>>>>> EODatabaseContext level.
>>>>>>>>
>>>>>>>> It is possible that an odd exception in EOAccess or below is resulting
>>>>>>>> in this not getting unlocked. Joe's reply below might be what is
>>>>>>>> happening to you.
>>>>>>>>
>>>>>>>> Chuck
>>>>>>>>
>>>>>>>>
>>>>>>>>> Another thing to note, if this is a long request to a database housed
>>>>>>>>> in an ESX VM: we had similar problems with long requests timing out
>>>>>>>>> between two systems, with one hosted by ESX 4.x. Such long requests
>>>>>>>>> were caught by some low-level interface muxing issue, and my whole EOF
>>>>>>>>> stack was frozen when the underlying DB connection was lost
>>>>>>>>> mid-transaction. I resolved it by moving this application off of a VM.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On May 31, 2010, at 5:33 PM, Pascal Robert <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok, will try with ERXWOLongResponsePage since it looks like it's
>>>>>>>>>>> locking and unlocking all ECs in the thread.
>>>>>>>>>>>
>>>>>>>>>>>> There's a bunch of stuff wrong here. First, the only actually
>>>>>>>>>>>> locked thread is:
>>>>>>>>>>>>
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.addCooperatingObjectStore(com.webobjects.eocontrol.EOCooperatingObjectStore)
>>>>>>>>>>>> @bci=5, line=130 (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.setCurrentEditingContext(com.webobjects.eocontrol.EOEditingContext)
>>>>>>>>>>>> @bci=34, line=166 (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel._selectWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=158, line=788
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseChannel.selectObjectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=64, line=215
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext._objectsWithFetchSpecificationEditingContext(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=219, line=3205
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eoaccess.EODatabaseContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=34, line=3346
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=97, line=539
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4114
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext) @bci=72, line=1211
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification)
>>>>>>>>>>>> @bci=3, line=4500 (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Licence.fetchLicences(com.webobjects.eocontrol.EOEditingContext,
>>>>>>>>>>>> com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>>> com.webobjects.foundation.NSArray) @bci=19, line=1062 (Interpreted
>>>>>>>>>>>> frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>>> com.webobjects.foundation.NSArray, boolean) @bci=77, line=8920
>>>>>>>>>>>> (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore._Membre.licences(com.webobjects.eocontrol.EOQualifier,
>>>>>>>>>>>> boolean) @bci=4, line=8893 (Interpreted frame)
>>>>>>>>>>>> -
>>>>>>>>>>>> com.acaiq.fondation.acaiqCore.Membre.licencesParEtats(com.acaiq.fondation.acaiqCore.EtatMembre[])
>>>>>>>>>>>> @bci=100, line=980 (Interpreted frame)
>>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.licencesValides() @bci=11,
>>>>>>>>>>>> line=996 (Interpreted frame)
>>>>>>>>>>>> - com.acaiq.fondation.acaiqCore.Membre.estCourtier() @bci=5,
>>>>>>>>>>>> line=1035 (Interpreted frame)
>>>>>>>>>>>> - sun.reflect.GeneratedMethodAccessor87.invoke(java.lang.Object,
>>>>>>>>>>>> java.lang.Object[]) @bci=40 (Interpreted frame)
>>>>>>>>>>>>
>>>>>>>>>>>> Which reminds me of an unlocked EC/OSC. Second:
>>>>>>>>>>>>
>>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle
>>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>>
>>>>>>>>>>>> This is a *template* that throws on null?? You sure that's such a
>>>>>>>>>>>> bright idea? Isn't this what validation is for? And third:
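
A rough sketch of the alternative hinted at here: let the setter accept null and report the problem from a validation step that runs before saving. This is plain Java. `EcomOrder` and `validateForSave` are hypothetical names, not the generated `_CommandesEcom` template or an EOF API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated entity class.
final class EcomOrder {
    private Integer noCommandeOracle;

    // The setter just stores the value, null included.
    void setNoCommandeOracle(Integer value) { this.noCommandeOracle = value; }

    Integer noCommandeOracle() { return noCommandeOracle; }

    // Validation collects problems so the caller can decide what to do
    // before anything is written to the database.
    List<String> validateForSave() {
        List<String> errors = new ArrayList<>();
        if (noCommandeOracle == null) {
            errors.add("noCommandeOracle is required");
        }
        return errors;
    }
}
```

With this shape, a missing value surfaces as a reportable error at save time instead of an exception thrown from the middle of a locked section.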
>>>>>>>>>>>>
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>>
>>>>>>>>>>>> As you're throwing from inside a normal
>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage, I seriously hope
>>>>>>>>>>>> you're doing your part of try{} finally{} and EC unlocking.
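
The lock/try/finally pattern referred to here, sketched with `java.util.concurrent.locks.ReentrantLock` standing in for the editing context lock (an assumption made so the example runs without WebObjects). `LockedWork` and `runLocked` are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

final class LockedWork {
    // Stand-in for the EC lock; a real EC would use ec.lock()/ec.unlock().
    final ReentrantLock lock = new ReentrantLock();

    // Runs the task while holding the lock and always releases it,
    // even when the task throws.
    <T> T runLocked(Supplier<T> task) {
        lock.lock();
        try {
            return task.get();
        } finally {
            lock.unlock();
        }
    }
}
```

The exception still propagates to the caller; the only guarantee is that the lock is not left held by a thread that has already given up.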
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers, Anjo
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 31.05.2010 at 20:02, Pascal Robert wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> One of our apps has deadlocked 5 times over 3 days. Strangely
>>>>>>>>>>>>> enough, it started when we moved our Oracle Database 10gR2 DB to
>>>>>>>>>>>>> our VMware ESX 4.0 cluster. We didn't re-install Oracle; I simply
>>>>>>>>>>>>> did a P2V (Physical to VM) conversion, so it's the exact same
>>>>>>>>>>>>> version of Oracle DB as before.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What's happening is that we store some information in our Oracle
>>>>>>>>>>>>> database, save it, and then build a copy of some of the data in a
>>>>>>>>>>>>> new EO (different entity) in a SQL Server 2005 DB so the
>>>>>>>>>>>>> accounting system can take care of billing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The exception that causes the deadlock (or at least the last thing
>>>>>>>>>>>>> written to the log before the deadlock):
>>>>>>>>>>>>>
>>>>>>>>>>>>> java.lang.IllegalArgumentException: Attribute noCommandeOracle
>>>>>>>>>>>>> can't receive a null parameter :
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.acaiq.fondation.depot.lbaArticle._CommandesEcom.setNoCommandeOracle(_CommandesEcom.java:419)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.acaiq.fondation.depot.Caissier.copiePourLBA(Caissier.java:267)
>>>>>>>>>>>>> at com.acaiq.fondation.depot.Caissier.paye(Caissier.java:137)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.acaiq.depot.component.TransactionAchat.performAction(TransactionAchat.java:63)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> com.webobjects.woextensions.WOLongResponsePage.run(WOLongResponsePage.java:119)
>>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:613)
>>>>>>>>>>>>>
>>>>>>>>>>>>> And it happens here :
>>>>>>>>>>>>>
>>>>>>>>>>>>> commandeEcom.setNoCommandeOracle(((Integer)
>>>>>>>>>>>>> _commande.clefsPrimaire()));
>>>>>>>>>>>>>
>>>>>>>>>>>>> _commande.clefsPrimaire() is a method that simply does:
>>>>>>>>>>>>>
>>>>>>>>>>>>> return ERXEOControlUtilities.primaryKeyObjectForObject(this);
>>>>>>>>>>>>>
>>>>>>>>>>>>> So clefsPrimaire() returns null, even though the data was stored in
>>>>>>>>>>>>> the Oracle DB (and it's really there) and a primary key was
>>>>>>>>>>>>> generated (EOF did it, it's not a "human generated" PK).
>>>>>>>>>>>>>
>>>>>>>>>>>>> The whole block :
>>>>>>>>>>>>>
>>>>>>>>>>>>> try {
>>>>>>>>>>>>>     _commande.setEstPaye(true);
>>>>>>>>>>>>>     // This is when we save the EO in Oracle
>>>>>>>>>>>>>     _commande.editingContext().saveChanges();
>>>>>>>>>>>>> } catch (Exception e) {
>>>>>>>>>>>>>     NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>>>     remboursementPayflow((Transaction) pfd.valueForKey("transaction"),
>>>>>>>>>>>>>         _commande, (String) pfd.valueForKey("PNREF"));
>>>>>>>>>>>>>     pfd = new NSDictionary<Object, String>(CODE_RESUTAT,
>>>>>>>>>>>>>         "REMBOURSEMENT");
>>>>>>>>>>>>> } finally {
>>>>>>>>>>>>>     try {
>>>>>>>>>>>>>         // This is the method where we copy some data to SQL Server
>>>>>>>>>>>>>         copiePourLBA(_commande);
>>>>>>>>>>>>>     } catch (Exception e) {
>>>>>>>>>>>>>         NSLog.err.appendln(e.getMessage());
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     _commande.editingContext().saveChanges();
>>>>>>>>>>>>> }
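
One thing worth keeping in mind when reading the block above: Java runs a `finally` block whether or not the `try` body threw, so `copiePourLBA` is reached even when the first `saveChanges()` failed and the refund path ran. The plain-Java sketch below records the order of steps. `PaymentFlow`, `run` and the step names are hypothetical and only mirror the shape of the original code.

```java
import java.util.ArrayList;
import java.util.List;

final class PaymentFlow {
    // Returns the steps executed, in order, for a save that succeeds or fails.
    static List<String> run(boolean saveFails) {
        List<String> steps = new ArrayList<>();
        try {
            steps.add("save");
            if (saveFails) {
                throw new IllegalStateException("save failed");
            }
        } catch (IllegalStateException e) {
            steps.add("refund");
        } finally {
            // Executed on both paths, like copiePourLBA in the original.
            steps.add("copy");
        }
        return steps;
    }
}
```

If the copy should only happen after a successful save, moving it out of `finally` and after the `try` may be simpler, though that is a design choice this thread does not settle.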
>>>>>>>>>>>>>
>>>>>>>>>>>>> This problem didn't happen in the past (but we also had fewer
>>>>>>>>>>>>> requests coming in) and it doesn't always happen :-/ jstack gives
>>>>>>>>>>>>> me tons of this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thread t...@71683: (state = BLOCKED)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=0,
>>>>>>>>>>>>> line=382 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50,
>>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted
>>>>>>>>>>>>> frame)
>>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26,
>>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The only non-blocked thread is :
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thread t...@78083: (state = IN_NATIVE)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.socketAccept(java.net.SocketImpl)
>>>>>>>>>>>>> @bci=0 (Interpreted frame)
>>>>>>>>>>>>> - java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=7,
>>>>>>>>>>>>> line=384 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.implAccept(java.net.Socket) @bci=50,
>>>>>>>>>>>>> line=450 (Interpreted frame)
>>>>>>>>>>>>> - java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted
>>>>>>>>>>>>> frame)
>>>>>>>>>>>>> - com.webobjects.appserver._private.WOWorkerThread.run() @bci=26,
>>>>>>>>>>>>> line=238 (Interpreted frame)
>>>>>>>>>>>>> - java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
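
With this many idle worker threads in a dump, it can help to filter for the ones that are actually waiting on a lock. A minimal sketch using `Thread.getAllStackTraces()` from the JDK follows. `StackFilter` and `threadsIn` are hypothetical names, and the class and method to search for (for example `com.webobjects.foundation.NSRecursiveLock` and `lock`) are supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class StackFilter {
    // Returns the names of threads whose stack contains a frame for the
    // given class and method.
    static List<String> threadsIn(Map<Thread, StackTraceElement[]> dump,
                                  String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : dump.entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(entry.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }

    // Convenience overload that inspects the live JVM.
    static List<String> threadsIn(String className, String methodName) {
        return threadsIn(Thread.getAllStackTraces(), className, methodName);
    }
}
```

This only works from inside the running application; a jstack text file captured from outside would need to be parsed as text instead.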
>>>>>>>>>>>>>
>>>>>>>>>>>>> Besides going back to our physical Oracle server, I have no idea
>>>>>>>>>>>>> why I'm getting this. Java 1.5 on OS X 10.4.11 Server, WO
>>>>>>>>>>>>> 5.3.3.
>>>>>>>>>>>>>
>>>>>>>>>>>>> <pid23991.txt>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----
>>>>>>>>>>>>> Pascal Robert
>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>
>>>>>>>>>>>>> AIM: MacTICanada
>>>>>>>>>>>>> Twitter : MacTICanada
>>>>>>>>>>>>> LinkedIn : http://www.linkedin.com/in/macti
>>>>>>>>>>>>> WO Community profile :
>>>>>>>>>>>>> http://wocommunity.org/page/member?name=probert
>>>>>
>>>>> --
>>>>> Chuck Hill Senior Consultant / VP Development
>>>>>
>>>>> Practical WebObjects - for developers who want to increase their overall
>>>>> knowledge of WebObjects or who are trying to solve specific problems.
>>>>> http://www.global-village.net/products/practical_webobjects
>>>>
>>>
>>
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list ([email protected])
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/webobjects-dev/dleber_wodev%40codeferous.com
>
> This email sent to [email protected]
;david
--
David LeBer
Codeferous Software
'co-def-er-ous' adj. Literally 'code-bearing'
site: http://codeferous.com
blog: http://davidleber.net
profile: http://www.linkedin.com/in/davidleber
twitter: http://twitter.com/rebeld
--
Toronto Area Cocoa / WebObjects developers group:
http://tacow.org