Re: Multiple copies of cyr_expire running

2009-04-28 Thread Gary Mills
On Wed, Apr 29, 2009 at 10:12:03AM +1000, Bron Gondwana wrote:
> On Tue, Apr 28, 2009 at 01:55:01PM -0500, Gary Mills wrote:
> > On Tue, Apr 28, 2009 at 02:10:02PM -0400, Adam Tauno Williams wrote:
> > > On Tue, 2009-04-28 at 08:13 -0500, Gary Mills wrote:
> > > > I notice that there are two of these running today:
> > > > $ ps -fp "$(pgrep cyr_expire)"
> > > >      UID   PID  PPID   C    STIME TTY    TIME CMD
> > > >    cyrus  2510   986   3 04:00:01 ?    219:28 cyr_expire -E 3
> > > >    cyrus 18280   986   3   Apr 27 ?   1580:15 cyr_expire -E 3
> > > > There are also lots of errors like this.  They refer to the same
> > > > message over and over again:
> > > > Apr 28 08:07:56 castor cyr_expire[18280]: [ID 264569 local6.error] 
> > > > DBERROR: mydelete: error deleting 
> > > > <200904201356.n3kdujes008...@taygeta.cc.umanitoba.ca>: DB_NOTFOUND: No 
> > > > matching key/data pair found
> 
> Bloody BDB.  I wish I understood it better.  Lots of people use it, so
> it seems it must be something odd Cyrus does that causes it to be
> relatively unreliable...

Yes, I hate that one too!  It's the only one.  The others are all
skiplist or flat.

> > > > Should I kill one of the cyr_expire processes?  Is there a safe way
> > > > to do this?  
> > > 
> > > I'd kill -15 both of them.  Then watch to see if they get stuck again.
> > 
> > I did that last time around, with bad results.  POP3 stopped working.
> > I had to restart master to fix that.
>  
> Odd - it shouldn't.  I have killed cyr_expire without problems before.

I didn't expect a problem either.

> Then again, we only run it once per week, so it never wraps!
> 
> > > > Is the duplicate delivery database broken?  Is there a
> > > > way to fix it?
> > > 
> > > There is no reason to fix it; I'd just delete it.  You may get a
> > > couple of duplicates, but that's no big deal.
> > 
> > I thought that some information needed by the sieve vacation responder
> > was stored in that database.  I don't want to break that feature for
> > thousands of people.
> 
> It may send a vacation response again.  All it stores is the "vacation
> already sent" data.

Okay, that's likely what I'll do.  I'll try your cyr_dbtool first to
see if it can delete that index entry.

> I would restart the master while deleting it though.
> Bron ( yes, that does kick off all your users... )

Yep.  Most e-mail clients seem to reconnect quickly, so it shouldn't
be too bad.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: Multiple copies of cyr_expire running

2009-04-28 Thread Bron Gondwana
On Tue, Apr 28, 2009 at 01:55:01PM -0500, Gary Mills wrote:
> On Tue, Apr 28, 2009 at 02:10:02PM -0400, Adam Tauno Williams wrote:
> > On Tue, 2009-04-28 at 08:13 -0500, Gary Mills wrote:
> > > I notice that there are two of these running today:
> > > $ ps -fp "$(pgrep cyr_expire)"
> > >      UID   PID  PPID   C    STIME TTY    TIME CMD
> > >    cyrus  2510   986   3 04:00:01 ?    219:28 cyr_expire -E 3
> > >    cyrus 18280   986   3   Apr 27 ?   1580:15 cyr_expire -E 3
> > > There are also lots of errors like this.  They refer to the same
> > > message over and over again:
> > > Apr 28 08:07:56 castor cyr_expire[18280]: [ID 264569 local6.error] 
> > > DBERROR: mydelete: error deleting 
> > > <200904201356.n3kdujes008...@taygeta.cc.umanitoba.ca>: DB_NOTFOUND: No 
> > > matching key/data pair found

Bloody BDB.  I wish I understood it better.  Lots of people use it, so
it seems it must be something odd Cyrus does that causes it to be
relatively unreliable...

> > > Should I kill one of the cyr_expire processes?  Is there a safe way
> > > to do this?  
> > 
> > I'd kill -15 both of them.  Then watch to see if they get stuck again.
> 
> I did that last time around, with bad results.  POP3 stopped working.
> I had to restart master to fix that.
 
Odd - it shouldn't.  I have killed cyr_expire without problems before.

Then again, we only run it once per week, so it never wraps!
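
If the second copy is simply the next scheduled run starting before the
previous one has finished (the "wrap" case), a lock around the command would
make the later run skip instead of piling up. The following is only a sketch:
the function name run_exclusive and the lock path in the usage comment are
made up for illustration, and how such a wrapper would be hooked into your
cyrus.conf is left out. It relies on mkdir being atomic, so two overlapping
invocations cannot both get the lock.

```shell
#!/bin/sh
# run_exclusive LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created. mkdir either creates the
# directory or fails, so it works as a simple atomic lock.
# Returns 75 if the lock could not be taken, else COMMAND's exit status.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "run_exclusive: cannot create $lockdir (previous run still active?)" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Hypothetical usage (the lock path is an assumption, not a Cyrus default):
#   run_exclusive /var/run/cyr_expire.lock cyr_expire -E 3
```

Note that the lock directory is left behind if the wrapper itself is killed,
so a stale lock has to be removed by hand before the next run will start.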

> > > Is the duplicate delivery database broken?  Is there a
> > > way to fix it?
> > 
> > There is no reason to fix it; I'd just delete it.  You may get a
> > couple of duplicates, but that's no big deal.
> 
> I thought that some information needed by the sieve vacation responder
> was stored in that database.  I don't want to break that feature for
> thousands of people.

It may send a vacation response again.  All it stores is the "vacation
already sent" data.

I would restart the master while deleting it though.

Bron ( yes, that does kick off all your users... )



Re: Multiple copies of cyr_expire running

2009-04-28 Thread Gary Mills
On Tue, Apr 28, 2009 at 02:10:02PM -0400, Adam Tauno Williams wrote:
> On Tue, 2009-04-28 at 08:13 -0500, Gary Mills wrote:
> > I notice that there are two of these running today:
> > $ ps -fp "$(pgrep cyr_expire)"
> >      UID   PID  PPID   C    STIME TTY    TIME CMD
> >    cyrus  2510   986   3 04:00:01 ?    219:28 cyr_expire -E 3
> >    cyrus 18280   986   3   Apr 27 ?   1580:15 cyr_expire -E 3
> > There are also lots of errors like this.  They refer to the same
> > message over and over again:
> > Apr 28 08:07:56 castor cyr_expire[18280]: [ID 264569 local6.error] 
> > DBERROR: mydelete: error deleting 
> > <200904201356.n3kdujes008...@taygeta.cc.umanitoba.ca>: DB_NOTFOUND: No 
> > matching key/data pair found
> > Should I kill one of the cyr_expire processes?  Is there a safe way
> > to do this?  
> 
> I'd kill -15 both of them.  Then watch to see if they get stuck again.

I did that last time around, with bad results.  POP3 stopped working.
I had to restart master to fix that.

> > Is the duplicate delivery database broken?  Is there a
> > way to fix it?
> 
> There is no reason to fix it; I'd just delete it.  You may get a
> couple of duplicates, but that's no big deal.

I thought that some information needed by the sieve vacation responder
was stored in that database.  I don't want to break that feature for
thousands of people.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-



Re: Multiple copies of cyr_expire running

2009-04-28 Thread Adam Tauno Williams
On Tue, 2009-04-28 at 08:13 -0500, Gary Mills wrote:
> I notice that there are two of these running today:
> $ ps -fp "$(pgrep cyr_expire)"
>      UID   PID  PPID   C    STIME TTY    TIME CMD
>    cyrus  2510   986   3 04:00:01 ?    219:28 cyr_expire -E 3
>    cyrus 18280   986   3   Apr 27 ?   1580:15 cyr_expire -E 3
> There are also lots of errors like this.  They refer to the same
> message over and over again:
> Apr 28 08:07:56 castor cyr_expire[18280]: [ID 264569 local6.error] 
> DBERROR: mydelete: error deleting 
> <200904201356.n3kdujes008...@taygeta.cc.umanitoba.ca>: DB_NOTFOUND: No 
> matching key/data pair found
> Should I kill one of the cyr_expire processes?  Is there a safe way
> to do this?  

I'd kill -15 both of them.  Then watch to see if they get stuck again.
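
For the "watch" part, something like the following might help. It is only a
sketch: the function name long_running and the 12-hour threshold in the usage
line are made up for illustration, and it assumes a POSIX awk (on Solaris that
probably means nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk) and a ps
that accepts -o pid=, -o etime= and -o args=. It reads "PID ETIME COMMAND"
lines on stdin and prints the PID and elapsed seconds of every matching
process that has been running at least as long as the threshold.

```shell
#!/bin/sh
# long_running PATTERN MIN_SECONDS
# stdin:  lines of "PID ETIME COMMAND...", where ETIME is the POSIX ps
#         elapsed-time format [[dd-]hh:]mm:ss
# stdout: "PID SECONDS" for each line whose command matches PATTERN and
#         whose elapsed time is at least MIN_SECONDS
long_running() {
    awk -v pat="$1" -v min="$2" '
        # Convert [[dd-]hh:]mm:ss to a number of seconds.
        function to_seconds(t,    d, p, n, days) {
            days = 0
            if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
            n = split(t, p, ":")
            if (n == 3) return days * 86400 + p[1] * 3600 + p[2] * 60 + p[3]
            return days * 86400 + p[1] * 60 + p[2]
        }
        {
            cmd = $0
            # Strip the PID and ETIME fields, leaving only the command line.
            sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/, "", cmd)
            secs = to_seconds($2)
            if (cmd ~ pat && secs >= min) print $1, secs
        }
    '
}
```

Run from cron or by hand, it could be fed like this (43200 seconds is 12
hours, an arbitrary choice):

    ps -e -o pid= -o etime= -o args= | long_running cyr_expire 43200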

> Is the duplicate delivery database broken?  Is there a
> way to fix it?

There is no reason to fix it; I'd just delete it.  You may get a
couple of duplicates, but that's no big deal.
-- 
OpenGroupware developer: awill...@whitemice.org




Multiple copies of cyr_expire running

2009-04-28 Thread Gary Mills
I notice that there are two of these running today:

$ ps -fp "$(pgrep cyr_expire)"
     UID   PID  PPID   C    STIME TTY    TIME CMD
   cyrus  2510   986   3 04:00:01 ?    219:28 cyr_expire -E 3
   cyrus 18280   986   3   Apr 27 ?   1580:15 cyr_expire -E 3

There are also lots of errors like this.  They refer to the same
message over and over again:

Apr 28 08:07:56 castor cyr_expire[18280]: [ID 264569 local6.error] DBERROR: 
mydelete: error deleting <200904201356.n3kdujes008...@taygeta.cc.umanitoba.ca>: 
DB_NOTFOUND: No matching key/data pair found

Should I kill one of the cyr_expire processes?  Is there a safe way
to do this?  Is the duplicate delivery database broken?  Is there a
way to fix it?

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
