anatoly techtonik wrote:
> On Sun, Feb 8, 2009 at 1:32 PM, Christian Boos <[email protected]> wrote:
>   
>>> Any ideas why Trac may stop syncing cache with repository after
>>> upgrading to 0.11.2.1?
>>> The log is full of strange messages about insufficient privileges for
>>> creating pages.
>>>
>>> 2009-02-07 17:55:11,224 Trac[__init__] DEBUG: Dispatching <Request
>>> "GET u'/timeline'">
>>> 2009-02-07 17:55:11,402 Trac[__init__] DEBUG: Subversion bindings imported
>>> 2009-02-07 17:55:11,417 Trac[__init__] INFO: repos rev [2557] !=
>>> cached rev [2414]
>>>
>>>       
>> I've already seen that once on someone's machine (hi John ;-) ), without
>> having a clue about it.
>> So it would be very helpful if you could spend some time debugging this
>> in depth.
>> By inserting enough debug statements in trac/versioncontrol/cache.py,
>> you should be able to get a clue about what's going on (e.g. missing
>> read permission on the svn repo?).
>>     
>
> It turned out to be locked cache after timeout. At first I thought
> about a race condition, but it was false. The cache can easily be
> jammed with a long resync from repository on a busy server. It took
> about 40 seconds to sync 200 revisions before timeout occurred. Take a
> cache.py sync()
> http://trac.edgewall.org/browser/branches/0.11-stable/trac/versioncontrol/cache.py#L158
>
> On line 159 we check if a revision is being entered in DB by looking
> if next revision number is already present in `revisions` table. That
> means another thread is made an INSERT in line 184. The cache is now
> locked. To unlock it, another thread should update youngest revision
> at line 221.
>   
Correct.
> So, the problem is when a timeout occurs during execution of 202:221 -
> there is an update in `node_change` table that can potentially take a
> long time to experience a timeout (esp. when server is under a heavy
> load). There are also cset.get_changes() and
> self.repos.next_rev(next_youngest) calls that can contribute to the
> issue. 

I don't really see how a time out can happen here. Do you mean a 
TimeoutError exception?
That shouldn't happen, as the "db" variable already refers to a 
connection, and TimeoutError exceptions are only raised when trying to 
get a connection from the cache.

But you're nevertheless right, if anything wrong happens there (unicode 
exception, db level timeout?), then there's no rollback done here.
That exception ends up being ignored at a later point (converted to a 
warning) and if this is followed by a commit (e.g. session save), then 
the cache remains locked.

> I haven't found other way to recover than to manually patch DB.
>
> DELETE FROM `node_change` WHERE `rev` > 2260;
> DELETE FROM `revision` WHERE `rev` > 2260;
> UPDATE `system` SET `value` = 2260 WHERE `name` = 'youngest_rev' LIMIT 1;
>
> Seems like the only way to avoid locking is to introduce timeout on
> 184:221 cache operations. Like additional `sync_time_started` = None
> or something else as an explicit flag.
>   

Rather simply doing a rollback in all cases when something goes.
I'll work on a patch.

>   
>>> 2009-02-07 17:55:11,424 Trac[__init__] DEBUG: Retrieving session for
>>> ID '4b0a7707907d96a01f8ff7e0'
>>> 2009-02-07 17:55:11,803 Trac[__init__] DEBUG: Prepare chrome data for 
>>> request
>>> 2009-02-07 17:55:11,805 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing TRAC_ADMIN on None
>>> 2009-02-07 17:55:11,806 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing PERMISSION_GRANT on None
>>> 2009-02-07 17:55:11,806 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing PERMISSION_REVOKE on None
>>> 2009-02-07 17:55:11,336 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing EMAIL_VIEW on None
>>> 2009-02-07 17:55:12,509 Trac[__init__] DEBUG: Updating wiki page index
>>> 2009-02-07 17:55:12,511 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing WIKI_CREATE on <Resource u'wiki:FreeFindData'>
>>> 2009-02-07 17:55:12,013 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing WIKI_CREATE on <Resource u'wiki:KeyBarLabels'>
>>> 2009-02-07 17:55:12,746 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing WIKI_CREATE on <Resource u'wiki:GoToFile'>
>>> 2009-02-07 17:55:12,266 Trac[__init__] DEBUG: No policy allowed
>>> anonymous performing WIKI_CREATE on <Resource u'wiki:GlobalLock'>
>>> 2009-02-07 17:55:12,308 Trac[__init__] DEBUG: No policy allowed
>>> <...>
>>>       
>> That's puzzling as well. Do you have some random page generation plugin
>> installed? :-)
>> It might be a bot trying to visit links to non-existing Wiki pages, try
>> to set `[wiki] ignore_missing_pages = true` in your trac.ini file...
>>     
>
> I thought about a bot too, because page names are different every
> time. These must be references from Source Browser, because Timeline
> references contain <a> tags without href attributes. Perhaps these
> were added from another thread while I was debugging my Timeline view.
> It would be more useful to see pid or thread id displayed in log
> instead of Trac[__init__] string. Does this Trac[__init__] change at
> all?
>   

Have a look at http://trac.edgewall.org/wiki/TracLogging#LogFormat and the
"The format is a string which can contain any of the Python logging 
Formatter variables. " part.
You can indeed add the thread id there.

Thanks for you in-depth analysis of the problem, this might solve one of 
the "weird" issues we still carry on since Trac 0.10 (e.g. #5440).

-- Christian


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Trac 
Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/trac-dev?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to