Hi Jocelyn,
Please check the latest dev build:
http://modis.ispras.ru/FTPContent/sedna/development/
It should fix the bug.
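To confirm the fix on your side, a check along these lines may help. It is only a sketch: it assumes Linux (/proc) and a user allowed to read the descriptor directory of se_gov (the same user or root). The names count_fds and fd_delta are made up for this example and are not part of Sedna.

```sh
#!/bin/sh
# Sketch only: count_fds and fd_delta are illustrative names, not Sedna tools.

# Print the number of open file descriptors of process $1 (Linux /proc).
# Prints 0 if the directory cannot be read (wrong user, or no such process).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Run a command and print how many descriptors process $1 gained meanwhile.
# Usage: fd_delta PID command [args...]
fd_delta() {
    pid=$1
    shift
    before=$(count_fds "$pid")
    "$@" >&2            # keep the command's own output off stdout
    after=$(count_fds "$pid")
    echo "$((after - before))"
}
```

With the variables from your backup script, fd_delta "$(pgrep se_gov)" $SEDNA_HB -time-dir -incr-mode add $DBNAME $INCR_DIRECTORY should print 0 if the increment no longer leaves a descriptor open in se_gov.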
Thank you for your feedback and for using Sedna!
Ivan Shcheklein,
Sedna Team
On Wed, Dec 21, 2011 at 7:49 PM, Raymond, Jocelyn <
jocelyn.raym...@ualberta.ca> wrote:
> Hi Ivan,
> Thanks for the shell script below. I tried it and found the cause. I
> don't think there is anything wrong with Sedna; it is probably a mistake
> on my side.
> The problem appears to come from my hot backups. When I disabled the hot
> backups, lsof showed no increase (on my development box). When I
> re-enabled my hot backup shell script, the count increased by one every
> time it ran (I run the hot backup every hour). I have attached my script,
> but the line that seems to cause this is in red below (the $SEDNA_HB call
> with -incr-mode add). Every time -incr-mode add is called, lsof shows an
> increase of one, and the count never goes back down, even after I call
> -incr-mode stop. It also increases by two when -incr-mode start is
> called. Am I missing something? Is there something else I should be
> calling when I want to terminate a hot backup cycle? Do you have any
> suggestions?
> Thanks for your help
>
>
> incr(){
>   if [ ${#INCR_DIRECTORY} -ne 0 ]; then
>     $SEDNA_HB -time-dir -incr-mode add $DBNAME $INCR_DIRECTORY
>     if [ $? -ne 0 ]; then
>       email "Sedna Hot Backup (increment) *failed* today (see $INCR_DIRECTORY)" "SEDNA HOT BACKUPS - FAILURE" $EMAILS
>       exit -1
>     fi
>   else
>     email "Sedna Hot Backup (increment) *failed* today because the cycle wasn't previously started (see $STORED_PATH_IN)" "SEDNA HOT BACKUPS - FAILURE" $EMAILS
>   fi
> }
>
>
> ------------------------------
> *From:* Ivan Shcheklein [mailto:shchekl...@gmail.com]
> *Sent:* December 19, 2011 1:23 PM
> *To:* Raymond, Jocelyn
> *Cc:* se...@ispras.ru
>
> *Subject:* Re: [Sedna-discussion] Sedna 3.4.66 (Linux) FATAL issue
>
> Ok, thanks for the data. However, we still don't know the reason.
> Everything seems ok.
>
> How long have you been using 3.4.66? Have you seen this problem before?
>
> Let's monitor se_gov for a while to collect at least the number of open
> files.
>
> 1. Create /tmp/monitor.sh (vim monitor.sh)
> 2. Copy the following script into this file. Every 10 seconds it prints
> one line: the current date, the number of files opened by the se_gov
> process, and the total number of open files.
>
> #!/bin/sh -e
>
> GOV_PID=`pgrep se_gov`
>
> while true; do
> echo `date` `lsof -n -p $GOV_PID | wc -l` `lsof -n | wc -l`
> sleep 10
> done
>
> 3. Make it executable. And check that it works:
>
> chmod +x /tmp/monitor.sh
> /tmp/monitor.sh
>
> it should print something like:
>
> Tue Dec 20 00:17:43 MSK 2011 81 2087
>
> 4. Make sure that Sedna is running (se_gov process exists)
>
> 5. Start it in background:
>
> nohup /tmp/monitor.sh >/tmp/log.out 2>&1 &
>
> 6. Make sure it's running properly (check that monitor.sh process exists
> and log.out is being appended).
>
> Then we have to wait until the next error happens or until log.out
> contains huge numbers (you can check it with tail -f /tmp/log.out).
>
> In any case, copy and send us log.out after one day.
>
> P.S. Don't forget to kill the monitor.sh process after the experiment is
> finished ). And let us know if you have problems running the script.
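
Once log.out has collected some samples, a summary like the one below could show at a glance whether the se_gov count is growing. This is a sketch, not part of Sedna: summarize_log is a made-up name, and it assumes every useful line ends with the two counts exactly as monitor.sh prints them. Lines that do not end in two numbers (for example lsof warnings captured by 2>&1) are skipped.

```sh
#!/bin/sh
# Sketch only: summarize_log is an illustrative helper for monitor.sh logs.

# Summarise the se_gov column (second-to-last field) of the log file $1.
# The counts are taken from the end of each line, so the width of the
# date part does not matter.
summarize_log() {
    awk '$(NF - 1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
             gov = $(NF - 1) + 0          # force numeric comparison
             if (n == 0) first = gov
             if (n == 0 || gov > peak) peak = gov
             last = gov
             n++
         }
         END {
             if (n == 0) { print "no samples"; exit 1 }
             printf "samples=%d first=%d last=%d peak=%d growth=%d\n",
                    n, first, last, peak, last - first
         }' "$1"
}
```

Running summarize_log /tmp/log.out prints one line; a growth value that keeps rising between runs would point at a descriptor leak in se_gov.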
>
> On Mon, Dec 19, 2011 at 8:02 PM, Raymond, Jocelyn <
> jocelyn.raym...@ualberta.ca> wrote:
>
>> From your instructions, I got the following (but don't really know how to
>> analyse it):
>> [sedna data]# ulimit -n
>> 1024
>> [sedna data]# sysctl -n fs.file-max
>> 203350
>> [sedna data]# se_stop
>> SEDNA server has been shut down successfully
>> [sedna data]# sudo lsof | wc -l
>> 824
>>
>> Also, this very morning at 06:11:02, our development Sedna (same setup
>> as prod) went down for what appears to be the same reason. A sample of
>> our log is attached. Here are the same commands executed on our
>> development machine:
>> [raja sedna]# ulimit -n
>> 1024
>> [raja sedna]# sysctl -n fs.file-max
>> 206005
>> [raja sedna]# se_stop
>> SEDNA server has been shut down successfully
>> [raja sedna]# sudo lsof | wc -l
>> 2533
>> Let me know if you have any insights on this problem.
>> Thank you again,
>> Jocelyn
>> University of Alberta
>>
>> ------------------------------
>> *From:* Ivan Shcheklein [mailto:shchekl...@gmail.com]
>> *Sent:* December 15, 2011 12:45 PM
>> *To:* Raymond, Jocelyn
>> *Cc:* sedna-discussion@lists.sourceforge.net
>> *Subject:* Re: [Sedna-discussion] Sedna 3.4.66 (Linux) FATAL issue
>>
>> Hi Jocelyn,
>>
>>
>>> We have been using Sedna for quite some time now (since version 0.6
>>> :-).
>>>
>>
>> Yep. We remember this :) Thank you for using it!
>>
>>
>>> SYS 15/12/2011 09:10:01 (GOV pid=24200) [ushm.c:uOpenShMem:71]:
>>> shm_open (code = 24): Too many open files
>>> FATAL 15/12/2011 09:10:01 (GOV pid=24200)
>>> [hb_funcs.cpp:hbSendMsgToSm:64]: Failed to initialize SSMMsg service
>>> (message service)
>>>
>>
>> First of all, we need to determine the actual cause of that problem
>> (Sedna or some other process which holds too many open files).
>>
>> Please run the following (as the Sedna user):
>>
>> ulimit -n
>> sysctl -n fs.file-max
>>
>> se_stop
>>
>> sudo lsof | wc -l
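
Error code 24 is EMFILE, the per-process limit, so it can also help to compare the descriptors held by se_gov with the limit that process actually runs under, which may differ from what ulimit -n reports in a login shell. The sketch below assumes Linux (/proc); proc_nofile_limit and fd_usage are names invented for this example.

```sh
#!/bin/sh
# Sketch only: both helpers are illustrative, not part of Sedna.

# Print the soft "Max open files" limit of the running process $1.
# In /proc/PID/limits the soft limit is the fourth field of that line.
proc_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Report usage as "used/limit (NN%)". Pure arithmetic, no /proc access.
# Usage: fd_usage USED LIMIT
fd_usage() {
    used=$1
    limit=$2
    case "$limit" in
        unlimited) echo "$used/unlimited"; return 0 ;;
    esac
    echo "$used/$limit ($((used * 100 / limit))%)"
}
```

For example, with pid=$(pgrep se_gov), the call fd_usage "$(ls /proc/$pid/fd | wc -l)" "$(proc_nofile_limit "$pid")" prints how close se_gov is to its own limit.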
>>
>>
>>
>>> I noticed that you have version 3.5 released. I will probably upgrade
>>> our devl and prod early in 2012 unless you believe that this problem could
>>> go away if I upgrade right now.
>>>
>>
>> We didn't fix any file descriptor leaks. Still, it's better to upgrade
>> anyway: it's faster, more stable, etc. :)
>>
>>
>> Ivan Shcheklein,
>> Sedna Team
>>
>
>
_______________________________________________
Sedna-discussion mailing list
Sedna-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sedna-discussion