Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-11-05 Thread Ivan Lagunov
Hi Ilya,

Thanks for the hint about jstack. I've analyzed two hanging issues and both
look identical on Java side:

One thread, let's call it A, takes a lock and does not release it:
"ajp-bio-8059-exec-21" daemon prio=10 tid=0x18a3e000 nid=0x4c89
runnable [0x434ee000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at net.cfoster.sedna.xmldb.p.a(Unknown Source)
at net.cfoster.sedna.xmldb.o.a(Unknown Source)
at net.cfoster.sedna.xmldb.o.a(Unknown Source)
at net.cfoster.sedna.xmldb.g.g(Unknown Source)
at net.cfoster.sedna.xmldb.g.(Unknown Source)
at net.cfoster.sedna.xmldb.h.c(Unknown Source)
- locked <0xc1c27948> (a net.cfoster.sedna.xmldb.h)
at net.cfoster.sedna.xmldb.h.a(Unknown Source)
- locked <0xc1c27948> (a net.cfoster.sedna.xmldb.h)
at net.cfoster.sedna.DatabaseImpl.getCollection(Unknown Source)

Another thread, let's call it B, waits for the lock:
"ajp-bio-8059-exec-22" daemon prio=10 tid=0x18c36800 nid=0x4c8e
waiting for monitor entry [0x41135000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at net.cfoster.sedna.xmldb.h.a(Unknown Source)
- waiting to lock <0xc1c27948> (a net.cfoster.sedna.xmldb.h)
at net.cfoster.sedna.DatabaseImpl.getCollection(Unknown Source)
at org.xmldb.api.DatabaseManager.getCollection(DatabaseManager.java:206)

On Sedna side the thread A database connection is processed and COMMITED in
the logs:
[tr_common_funcs.cpp:on_transaction_end:414]: Transaction has been COMMITED
[sp.c:sp_send_msg:64]: Message send : type 250, len 0

The thread B database connection is processed by Sedna and ends up waiting
for the client's response:
[sp.c:sp_send_msg:64]: Message send : type 360, len 1086
[sp.c:sp_send_msg:64]: Message send : type 370, len 0

The client thread B does not respond as it waits for the lock and the
corresponding Sedna connection is eventually closed by XML:DB API
implementation with the message:
Warning: an open Sedna XML:DB Collection/Connection was just automatically
closed due to inactivity, you SHOULD close these manually.

Meanwhile thread A successfully completes the processing and returns the
response while still keeping the lock.

@Ilya, so it's definitely the client-side issue, related either to JDK or
to Charles Foster implementation. I've upgraded to the latest JDK 1.8.0_25
just for a check. If it does not help, I may ask Charles for a paid support
as he suggested.

Best regards,
Ivan Lagunov


On Tue, Oct 28, 2014 at 9:41 PM, Ilya Taranov  wrote:

> Hi, Ivan,
>
>   Unfortunately, I'm too busy at my full-time job currently, so sometimes,
> I just don't have a chance to take a look at your dropbox file during the
> day. Sorry for that.
>
>   What I have seen so far, is that everything confirms my suspicion of the
> client somehow involved. After 27478 sent 370 message, it awaits for
> 310/220/225 from client to continue, holding locks. That's perfectly fine.
> It's hard to tell anything without understanding why client is not willing
> to send it. May that be some kind of hang? I wonder what the client is
> doing, could you take a look at the client backtrace during the hang (like
> the one returned by jstack/sending SIG_QUIT)?
>
>   Index problem you mentioned is a separate issue, I will take a look at.
>
> Thanks,
> Ilya Taranov.
>
>
> On Tue, Oct 28, 2014 at 1:09 PM, Ivan Lagunov  wrote:
>
>> I've faced two more hangings today. Again I'm posting only those
>> connections that were created by our suspected application. Besides, I'm
>> adding the last debug Message from the event.log for each transaction.
>>
>> https://www.dropbox.com/s/istotjr41vrmmd5/20141028001023.ZIP?dl=0
>>
>> se_trn 21097 - fast XQuery, COMMITED, Message recv : type 210, len 0
>> se_trn 21114 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
>> thread 1 in select, Message send : type 370, len 0
>> se_trn 21127 - fast XQuery, COMMITED, Message recv : type 210, len 0
>> se_trn 21229 - Session is ready
>>
>> So this hanging issue looks completely the same as those of yesterday.
>> However, yesterday's problem has been fixed and now it happens roughly once
>> in several hundreds cases.
>>
>> https://www.dropbox.com/s/ru6r6rplar3fat9/20141028151343.ZIP?dl=0
>>
>> se_trn 27478 - fast XQuery basictypes:getBasicType(), not COMMITED, gdb
>> thread 1 in select, Message send : type 370, len 0
>> se_trn 27932 - fast XQuery, COMMITED, Message recv : type 210, len 0
>> se_trn 28030 - Session is ready
>>
>> Obviously, the issue has not gone. Moreover, the second case confirms
>> that the database can hang on various XQueries as before. As the
>> yesterday's issue looks really the same but just worse, I've come to a
>> conclusion that other hangings may be related to the Java version too. I'm
>> going to try running the application under a different Java version.

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-28 Thread Ilya Taranov
Hi, Ivan,

  Unfortunately, I'm too busy at my full-time job currently, so sometimes,
I just don't have a chance to take a look at your dropbox file during the
day. Sorry for that.

  What I have seen so far, is that everything confirms my suspicion of the
client somehow involved. After 27478 sent 370 message, it awaits for
310/220/225 from client to continue, holding locks. That's perfectly fine.
It's hard to tell anything without understanding why client is not willing
to send it. May that be some kind of hang? I wonder what the client is
doing, could you take a look at the client backtrace during the hang (like
the one returned by jstack/sending SIG_QUIT)?

  Index problem you mentioned is a separate issue, I will take a look at.

Thanks,
Ilya Taranov.

On Tue, Oct 28, 2014 at 1:09 PM, Ivan Lagunov  wrote:

> I've faced two more hangings today. Again I'm posting only those
> connections that were created by our suspected application. Besides, I'm
> adding the last debug Message from the event.log for each transaction.
>
> https://www.dropbox.com/s/istotjr41vrmmd5/20141028001023.ZIP?dl=0
>
> se_trn 21097 - fast XQuery, COMMITED, Message recv : type 210, len 0
> se_trn 21114 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
> thread 1 in select, Message send : type 370, len 0
> se_trn 21127 - fast XQuery, COMMITED, Message recv : type 210, len 0
> se_trn 21229 - Session is ready
>
> So this hanging issue looks completely the same as those of yesterday.
> However, yesterday's problem has been fixed and now it happens roughly once
> in several hundreds cases.
>
> https://www.dropbox.com/s/ru6r6rplar3fat9/20141028151343.ZIP?dl=0
>
> se_trn 27478 - fast XQuery basictypes:getBasicType(), not COMMITED, gdb
> thread 1 in select, Message send : type 370, len 0
> se_trn 27932 - fast XQuery, COMMITED, Message recv : type 210, len 0
> se_trn 28030 - Session is ready
>
> Obviously, the issue has not gone. Moreover, the second case confirms that
> the database can hang on various XQueries as before. As the yesterday's
> issue looks really the same but just worse, I've come to a conclusion that
> other hangings may be related to the Java version too. I'm going to try
> running the application under a different Java version.
>
> @Ilya, could you comment on the provided event.log with the debug messages
> generated by your patch? I'm curious if it can give better insight into the
> problem.
>
> Best regards,
> Ivan Lagunov
>
> On Mon, Oct 27, 2014 at 5:46 PM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> Finally I've got some clues about Sedna hangings. Today the same
>> application that is under suspicion has been deployed by another developer.
>> It has been built on the developer's machine so apparently the JDK has been
>> different from the JDK that is used for running the application on the
>> server. I'm still waiting for more details on this matter. But this
>> deployment has resulted in a series of Sedna hangings. Although it differs
>> from the hanging scenarios that we had faced before, it may still be
>> somehow related.
>>
>> Here is the archive with logs of the first hanging issue of today:
>>
>> https://www.dropbox.com/s/ufwayhwx7ersy88/20141027131246.ZIP?dl=0
>>
>> These are the transaction processes created by the suspected application:
>> se_trn 17209 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
>> thread 1 in select
>> se_trn 17217 - Session is ready
>>
>> All other today's hangings have been caused by the same XQuery
>> categorytree:getCategoryTree(). It simply loads and returns a big document
>> TreeExport.xml (about 2.5MB) from the collection "treeStructures/released".
>> It's very likely that the "Sedna index subsystem error" from my previous
>> email is also related to this. There have been no more hangings after I
>> rebuilt and redeployed the Java application on the server using the same
>> "good" JDK that is used for running.
>>
>> @Ilya, could you please look into the event.log and say if the client
>> indeed makes Sedna wait as you had suggested before?
>>
>> This is an extract from event.log:
>>
>> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=-1)
>> [sp.c:sp_recv_msg:34]: Message recv : type 210, len 0
>> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
>> [sp.c:sp_send_msg:64]: Message send : type 230, len 0
>> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
>> [sp.c:sp_recv_msg:34]: Message recv : type 300, len 220
>> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
>> [tr_functions.cpp:on_user_statement_begin:102]: User query:
>> --- categorytree:getCategoryTree()
>> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
>> [sp.c:sp_send_msg:64]: Message send : type 320, len 0
>> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
>> [sp.c:sp_send_msg:64]: Message send : type 360, len 10240
>> ... plenty of lines like this ...
>> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
>> [sp.c:sp_send_ms

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-28 Thread Ivan Lagunov
I've faced two more hangings today. Again I'm posting only those
connections that were created by our suspected application. Besides, I'm
adding the last debug Message from the event.log for each transaction.

https://www.dropbox.com/s/istotjr41vrmmd5/20141028001023.ZIP?dl=0

se_trn 21097 - fast XQuery, COMMITED, Message recv : type 210, len 0
se_trn 21114 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
thread 1 in select, Message send : type 370, len 0
se_trn 21127 - fast XQuery, COMMITED, Message recv : type 210, len 0
se_trn 21229 - Session is ready

So this hanging issue looks completely the same as those of yesterday.
However, yesterday's problem has been fixed and now it happens roughly once
in several hundreds cases.

https://www.dropbox.com/s/ru6r6rplar3fat9/20141028151343.ZIP?dl=0

se_trn 27478 - fast XQuery basictypes:getBasicType(), not COMMITED, gdb
thread 1 in select, Message send : type 370, len 0
se_trn 27932 - fast XQuery, COMMITED, Message recv : type 210, len 0
se_trn 28030 - Session is ready

Obviously, the issue has not gone. Moreover, the second case confirms that
the database can hang on various XQueries as before. As the yesterday's
issue looks really the same but just worse, I've come to a conclusion that
other hangings may be related to the Java version too. I'm going to try
running the application under a different Java version.

@Ilya, could you comment on the provided event.log with the debug messages
generated by your patch? I'm curious if it can give better insight into the
problem.

Best regards,
Ivan Lagunov

On Mon, Oct 27, 2014 at 5:46 PM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> Finally I've got some clues about Sedna hangings. Today the same
> application that is under suspicion has been deployed by another developer.
> It has been built on the developer's machine so apparently the JDK has been
> different from the JDK that is used for running the application on the
> server. I'm still waiting for more details on this matter. But this
> deployment has resulted in a series of Sedna hangings. Although it differs
> from the hanging scenarios that we had faced before, it may still be
> somehow related.
>
> Here is the archive with logs of the first hanging issue of today:
>
> https://www.dropbox.com/s/ufwayhwx7ersy88/20141027131246.ZIP?dl=0
>
> These are the transaction processes created by the suspected application:
> se_trn 17209 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
> thread 1 in select
> se_trn 17217 - Session is ready
>
> All other today's hangings have been caused by the same XQuery
> categorytree:getCategoryTree(). It simply loads and returns a big document
> TreeExport.xml (about 2.5MB) from the collection "treeStructures/released".
> It's very likely that the "Sedna index subsystem error" from my previous
> email is also related to this. There have been no more hangings after I
> rebuilt and redeployed the Java application on the server using the same
> "good" JDK that is used for running.
>
> @Ilya, could you please look into the event.log and say if the client
> indeed makes Sedna wait as you had suggested before?
>
> This is an extract from event.log:
>
> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=-1)
> [sp.c:sp_recv_msg:34]: Message recv : type 210, len 0
> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 230, len 0
> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_recv_msg:34]: Message recv : type 300, len 220
> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
> [tr_functions.cpp:on_user_statement_begin:102]: User query:
> --- categorytree:getCategoryTree()
> LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 320, len 0
> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 360, len 10240
> ... plenty of lines like this ...
> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 360, len 10240
> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 360, len 8204
> LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
> [sp.c:sp_send_msg:64]: Message send : type 370, len 0
>
> Just to clarify about our earlier hangings, afaik all the deployments of
> the suspected application were performed on the server using "good" JDK
> earlier. So while earlier hangings might have been somehow related to
> today's, the situation is quite different.
>
> Best regards,
> Ivan Lagunov
>
>
> On Mon, Oct 27, 2014 at 12:46 PM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> The patch has been applied to the night build 3.6.519, the data has been
>> migrated with se_exp using complete export-import. The issue has not appear
>> again for a week already. I'll keep monitoring Sedna.
>>
>> However, it seems there is another issue with that night 

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-27 Thread Ivan Lagunov
Hi Ilya,

Finally I've got some clues about Sedna hangings. Today the same
application that is under suspicion has been deployed by another developer.
It has been built on the developer's machine so apparently the JDK has been
different from the JDK that is used for running the application on the
server. I'm still waiting for more details on this matter. But this
deployment has resulted in a series of Sedna hangings. Although it differs
from the hanging scenarios that we had faced before, it may still be
somehow related.

Here is the archive with logs of the first hanging issue of today:

https://www.dropbox.com/s/ufwayhwx7ersy88/20141027131246.ZIP?dl=0

These are the transaction processes created by the suspected application:
se_trn 17209 - XQuery categorytree:getCategoryTree(), not COMMITED, gdb
thread 1 in select
se_trn 17217 - Session is ready

All other today's hangings have been caused by the same XQuery
categorytree:getCategoryTree(). It simply loads and returns a big document
TreeExport.xml (about 2.5MB) from the collection "treeStructures/released".
It's very likely that the "Sedna index subsystem error" from my previous
email is also related to this. There have been no more hangings after I
rebuilt and redeployed the Java application on the server using the same
"good" JDK that is used for running.

@Ilya, could you please look into the event.log and say if the client
indeed makes Sedna wait as you had suggested before?

This is an extract from event.log:

LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=-1)
[sp.c:sp_recv_msg:34]: Message recv : type 210, len 0
LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 230, len 0
LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_recv_msg:34]: Message recv : type 300, len 220
LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
[tr_functions.cpp:on_user_statement_begin:102]: User query:
--- categorytree:getCategoryTree()
LOG   27/10/2014 12:51:43 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 320, len 0
LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 360, len 10240
... plenty of lines like this ...
LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 360, len 10240
LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 360, len 8204
LOG   27/10/2014 12:51:45 (TRN nxp pid=17209 sid=44 trid=48)
[sp.c:sp_send_msg:64]: Message send : type 370, len 0

Just to clarify about our earlier hangings, afaik all the deployments of
the suspected application were performed on the server using "good" JDK
earlier. So while earlier hangings might have been somehow related to
today's, the situation is quite different.

Best regards,
Ivan Lagunov


On Mon, Oct 27, 2014 at 12:46 PM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> The patch has been applied to the night build 3.6.519, the data has been
> migrated with se_exp using complete export-import. The issue has not appear
> again for a week already. I'll keep monitoring Sedna.
>
> However, it seems there is another issue with that night build:
>
> LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=-1)
> [sp.c:sp_send_msg:64]: Message send : type 250, len 0
> LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=-1)
> [sp.c:sp_recv_msg:34]: Message recv : type 210, len 0
> LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
> [sp.c:sp_send_msg:64]: Message send : type 230, len 0
> LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
> [sp.c:sp_recv_msg:34]: Message recv : type 300, len 76
> LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
> [tr_functions.cpp:on_user_statement_begin:102]: User query:
> ---   DROP DOCUMENT "TreeExport.xml" IN COLLECTION
> "treeStructures/released"
> ERROR 27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=49)
> [btree.cpp:bt_delete_tmpl:273]: (SE1008) Sedna index subsystem error.
> Details: Cannot delete object which is not in the btree
> LOG   27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=-1)
> [tr_common_funcs.cpp:on_transaction_end:416]: Transaction has been ROLLED
> BACK
> LOG   27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=-1)
> [sp.c:sp_send_msg:64]: Message send : type 100, len 122
> LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
> [sp.c:sp_recv_msg:34]: Message recv : type 500, len 0
> LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
> [sp.c:sp_send_msg:64]: Message send : type 510, len 0
> INFO  27/10/2014 01:21:25 (SM nxp pid=19377)
> [bm_core.cpp:bm_log_out_io_statistics:537]: IO block reads:0, writes:0
> LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
> [tr.cpp:TRmain:456]: Session is closed
>
> It's persistent, I could not delete the document at all until I deleted
> the indexes on that collection:
>on_path="descendant::element

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-27 Thread Ivan Lagunov
Hi Ilya,

The patch has been applied to the night build 3.6.519, the data has been
migrated with se_exp using complete export-import. The issue has not appear
again for a week already. I'll keep monitoring Sedna.

However, it seems there is another issue with that night build:

LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=-1)
[sp.c:sp_send_msg:64]: Message send : type 250, len 0
LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=-1)
[sp.c:sp_recv_msg:34]: Message recv : type 210, len 0
LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
[sp.c:sp_send_msg:64]: Message send : type 230, len 0
LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
[sp.c:sp_recv_msg:34]: Message recv : type 300, len 76
LOG   27/10/2014 01:21:18 (TRN nxp pid=20091 sid=49 trid=49)
[tr_functions.cpp:on_user_statement_begin:102]: User query:
---   DROP DOCUMENT "TreeExport.xml" IN COLLECTION "treeStructures/released"
ERROR 27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=49)
[btree.cpp:bt_delete_tmpl:273]: (SE1008) Sedna index subsystem error.
Details: Cannot delete object which is not in the btree
LOG   27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=-1)
[tr_common_funcs.cpp:on_transaction_end:416]: Transaction has been ROLLED
BACK
LOG   27/10/2014 01:21:20 (TRN nxp pid=20091 sid=49 trid=-1)
[sp.c:sp_send_msg:64]: Message send : type 100, len 122
LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
[sp.c:sp_recv_msg:34]: Message recv : type 500, len 0
LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
[sp.c:sp_send_msg:64]: Message send : type 510, len 0
INFO  27/10/2014 01:21:25 (SM nxp pid=19377)
[bm_core.cpp:bm_log_out_io_statistics:537]: IO block reads:0, writes:0
LOG   27/10/2014 01:21:25 (TRN nxp pid=20091 sid=49 trid=-1)
[tr.cpp:TRmain:456]: Session is closed

It's persistent, I could not delete the document at all until I deleted the
indexes on that collection:
  
  

Any ideas?

Best regards,
Ivan Lagunov


On Fri, Oct 17, 2014 at 10:05 AM, Ilya Taranov  wrote:

> Hi, here is the patch over the latest label that should add protocol
> minimal debugging.
>
> On Wed, Oct 15, 2014 at 9:53 AM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> As it's stated in the email topic, I use Sedna 3.5.615. It was built
>> from sources with RelWithDebugInfo flag enabled on the RHEL linux
>> server. If you want to add some additional debug info, I'd suggest adding
>> it on top of the latest night build though. I'm planning to upgrade anyway
>> as there were quite some nice features implemented.
>>
>> @Charles, okay, nice to know this. If only I could provide a consistently
>> failing test case. Currently the issue is quite mysterious.
>>
>> Best regards,
>> Ivan Lagunov
>>
>> On Wed, Oct 15, 2014 at 11:02 AM, Ivan Shcheklein 
>> wrote:
>>
>>> Ilya,
>>>
>>> As far as I remember. It is almost the latest dev sources built with
>>> relWithDebInfo. Internal data structures should be compatible.
>>>
>>> Ivan
>>>
>>> On Wed, Oct 15, 2014 at 2:08 AM, Ilya Taranov 
>>> wrote:
>>>
 I'm not sure, latest build will fix the problem.

 It's definitely some client-to-app communication problem.
 If I make a git patch for you with additional debug info, will you be
 able to apply it (assuming, the data file version will be the same)? What
 build do you currently use?

 On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov 
 wrote:

> Hi Ilya,
>
> I've analyzed logs for other issue occurrences and it has appeared
> more complicated.
>
> First, could you please clarify about "Session is ready" statement,
> when exactly is it dumped to the log file? I'm wondering because when the
> database is hanging, those log entries still appear at some random moments
> of time.
>
> Second, I'm going to provide the investigation details for several
> more issue occurrences here. Below "fast XQuery" means that it uses index
> and is expected to be very fast.
>
> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>
> se_trn 27173 - DROP DOCUMENT operation, COMMITED
> se_gov 29197
> se_sm 29224
> se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
> thread 1 in select
> se_trn 29820 - Session is ready
> se_trn 29976 - Session is ready
>
> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>
> se_gov 15890
> se_sm 15899
> se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
> nanosleep
> se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
> nanosleep
> se_trn 21996 - fast XQuery

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-17 Thread Ilya Taranov
Hi, here is the patch over the latest label that should add protocol
minimal debugging.

On Wed, Oct 15, 2014 at 9:53 AM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> As it's stated in the email topic, I use Sedna 3.5.615. It was built from
> sources with RelWithDebugInfo flag enabled on the RHEL linux server. If
> you want to add some additional debug info, I'd suggest adding it on top of
> the latest night build though. I'm planning to upgrade anyway as there were
> quite some nice features implemented.
>
> @Charles, okay, nice to know this. If only I could provide a consistently
> failing test case. Currently the issue is quite mysterious.
>
> Best regards,
> Ivan Lagunov
>
> On Wed, Oct 15, 2014 at 11:02 AM, Ivan Shcheklein 
> wrote:
>
>> Ilya,
>>
>> As far as I remember. It is almost the latest dev sources built with
>> relWithDebInfo. Internal data structures should be compatible.
>>
>> Ivan
>>
>> On Wed, Oct 15, 2014 at 2:08 AM, Ilya Taranov 
>> wrote:
>>
>>> I'm not sure, latest build will fix the problem.
>>>
>>> It's definitely some client-to-app communication problem.
>>> If I make a git patch for you with additional debug info, will you be
>>> able to apply it (assuming, the data file version will be the same)? What
>>> build do you currently use?
>>>
>>> On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov 
>>> wrote:
>>>
 Hi Ilya,

 I've analyzed logs for other issue occurrences and it has appeared more
 complicated.

 First, could you please clarify about "Session is ready" statement,
 when exactly is it dumped to the log file? I'm wondering because when the
 database is hanging, those log entries still appear at some random moments
 of time.

 Second, I'm going to provide the investigation details for several more
 issue occurrences here. Below "fast XQuery" means that it uses index and is
 expected to be very fast.

 https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0

 se_trn 27173 - DROP DOCUMENT operation, COMMITED
 se_gov 29197
 se_sm 29224
 se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
 thread 1 in select
 se_trn 29820 - Session is ready
 se_trn 29976 - Session is ready

 https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0

 se_gov 15890
 se_sm 15899
 se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
 nanosleep
 se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
 se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
 se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
 se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
 se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
 nanosleep
 se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
 thread 1 in select
 se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb
 thread 1 in nanosleep
 se_trn 22010 - Session is ready
 se_trn 22013 - Session is ready
 se_trn 22152 - Session is ready
 se_trn 22295 - Session is ready
 se_trn 23615 - Session is ready
 se_trn 24315 - Session is ready

 Questions:
 1. Does nanosleep state mean that the operation is waiting for another
 one (here it's probably 21996 XQuery in select)?
 2. Why does doc-available wait for any other operation? I'd expect it
 to be non-blocking and to run immediately on the latest data snapshot.
 3. Why does doc('$version') wait for any other operation? It's a
 database metadata being fetched, so I'd also expect it to run immediately.

 https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0

 se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread
 1 in select
 se_trn 645 - Session is ready & COMMITED without operation
 se_trn 648 - Session is ready
 se_trn 823 - Session is ready
 se_trn 898 - Session is ready
 se_gov 8444
 se_sm 8453

 Questions:
 What is going on with 645? There is "Session in ready" statement
 immediately followed by "COMMITED" statement in the event.log. What
 operation was committed? Is it possible the order of statements in the
 event.log is inconsistent? Even so there is no more statements with
 pid=645, so no operation to be seen that could be committed. Very
 suspicious.

 https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0

 se_gov 1532
 se_sm 1711
 se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
 nanosleep
 se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in
 nanosleep
 se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb
 thread 1 in nanosleep
 se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb
 thread 1 in nanosleep
 se_tr

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-15 Thread Ivan Lagunov
Hi Ilya,

As it's stated in the email topic, I use Sedna 3.5.615. It was built from
sources with RelWithDebugInfo flag enabled on the RHEL linux server. If you
want to add some additional debug info, I'd suggest adding it on top of the
latest night build though. I'm planning to upgrade anyway as there were
quite some nice features implemented.

@Charles, okay, nice to know this. If only I could provide a consistently
failing test case. Currently the issue is quite mysterious.

Best regards,
Ivan Lagunov

On Wed, Oct 15, 2014 at 11:02 AM, Ivan Shcheklein 
wrote:

> Ilya,
>
> As far as I remember. It is almost the latest dev sources built with
> relWithDebInfo. Internal data structures should be compatible.
>
> Ivan
>
> On Wed, Oct 15, 2014 at 2:08 AM, Ilya Taranov 
> wrote:
>
>> I'm not sure, latest build will fix the problem.
>>
>> It's definitely some client-to-app communication problem.
>> If I make a git patch for you with additional debug info, will you be
>> able to apply it (assuming, the data file version will be the same)? What
>> build do you currently use?
>>
>> On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov  wrote:
>>
>>> Hi Ilya,
>>>
>>> I've analyzed logs for other issue occurrences and it has appeared more
>>> complicated.
>>>
>>> First, could you please clarify about "Session is ready" statement, when
>>> exactly is it dumped to the log file? I'm wondering because when the
>>> database is hanging, those log entries still appear at some random moments
>>> of time.
>>>
>>> Second, I'm going to provide the investigation details for several more
>>> issue occurrences here. Below "fast XQuery" means that it uses index and is
>>> expected to be very fast.
>>>
>>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>>
>>> se_trn 27173 - DROP DOCUMENT operation, COMMITED
>>> se_gov 29197
>>> se_sm 29224
>>> se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>>> thread 1 in select
>>> se_trn 29820 - Session is ready
>>> se_trn 29976 - Session is ready
>>>
>>> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>>>
>>> se_gov 15890
>>> se_sm 15899
>>> se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
>>> nanosleep
>>> se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>>> se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>>> se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>>> se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>>> se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
>>> nanosleep
>>> se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>>> thread 1 in select
>>> se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb thread
>>> 1 in nanosleep
>>> se_trn 22010 - Session is ready
>>> se_trn 22013 - Session is ready
>>> se_trn 22152 - Session is ready
>>> se_trn 22295 - Session is ready
>>> se_trn 23615 - Session is ready
>>> se_trn 24315 - Session is ready
>>>
>>> Questions:
>>> 1. Does nanosleep state mean that the operation is waiting for another
>>> one (here it's probably 21996 XQuery in select)?
>>> 2. Why does doc-available wait for any other operation? I'd expect it to
>>> be non-blocking and to run immediately on the latest data snapshot.
>>> 3. Why does doc('$version') wait for any other operation? It's a
>>> database metadata being fetched, so I'd also expect it to run immediately.
>>>
>>> https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0
>>>
>>> se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread 1
>>> in select
>>> se_trn 645 - Session is ready & COMMITED without operation
>>> se_trn 648 - Session is ready
>>> se_trn 823 - Session is ready
>>> se_trn 898 - Session is ready
>>> se_gov 8444
>>> se_sm 8453
>>>
>>> Questions:
>>> What is going on with 645? There is "Session in ready" statement
>>> immediately followed by "COMMITED" statement in the event.log. What
>>> operation was committed? Is it possible the order of statements in the
>>> event.log is inconsistent? Even so there is no more statements with
>>> pid=645, so no operation to be seen that could be committed. Very
>>> suspicious.
>>>
>>> https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0
>>>
>>> se_gov 1532
>>> se_sm 1711
>>> se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
>>> nanosleep
>>> se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in
>>> nanosleep
>>> se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb thread
>>> 1 in nanosleep
>>> se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb thread
>>> 1 in nanosleep
>>> se_trn 12139 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>>> gdb thread 1 in select
>>> se_trn 12183 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>>> thread 1 in select
>>> se_trn 12186 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>>> gdb thread 1 i

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-15 Thread Ivan Shcheklein
Ilya,

As far as I remember. It is almost the latest dev sources built with
relWithDebInfo. Internal data structures should be compatible.

Ivan

On Wed, Oct 15, 2014 at 2:08 AM, Ilya Taranov  wrote:

> I'm not sure, latest build will fix the problem.
>
> It's definitely some client-to-app communication problem.
> If I make a git patch for you with additional debug info, will you be able
> to apply it (assuming, the data file version will be the same)? What build
> do you currently use?
>
> On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> I've analyzed logs for other issue occurrences and it has appeared more
>> complicated.
>>
>> First, could you please clarify about "Session is ready" statement, when
>> exactly is it dumped to the log file? I'm wondering because when the
>> database is hanging, those log entries still appear at some random moments
>> of time.
>>
>> Second, I'm going to provide the investigation details for several more
>> issue occurrences here. Below "fast XQuery" means that it uses index and is
>> expected to be very fast.
>>
>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>
>> se_trn 27173 - DROP DOCUMENT operation, COMMITED
>> se_gov 29197
>> se_sm 29224
>> se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 29820 - Session is ready
>> se_trn 29976 - Session is ready
>>
>> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>>
>> se_gov 15890
>> se_sm 15899
>> se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
>> nanosleep
>> se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
>> nanosleep
>> se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 22010 - Session is ready
>> se_trn 22013 - Session is ready
>> se_trn 22152 - Session is ready
>> se_trn 22295 - Session is ready
>> se_trn 23615 - Session is ready
>> se_trn 24315 - Session is ready
>>
>> Questions:
>> 1. Does nanosleep state mean that the operation is waiting for another
>> one (here it's probably 21996 XQuery in select)?
>> 2. Why does doc-available wait for any other operation? I'd expect it to
>> be non-blocking and to run immediately on the latest data snapshot.
>> 3. Why does doc('$version') wait for any other operation? It's a database
>> metadata being fetched, so I'd also expect it to run immediately.
>>
>> https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0
>>
>> se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread 1
>> in select
>> se_trn 645 - Session is ready & COMMITED without operation
>> se_trn 648 - Session is ready
>> se_trn 823 - Session is ready
>> se_trn 898 - Session is ready
>> se_gov 8444
>> se_sm 8453
>>
>> Questions:
>> What is going on with 645? There is "Session in ready" statement
>> immediately followed by "COMMITED" statement in the event.log. What
>> operation was committed? Is it possible the order of statements in the
>> event.log is inconsistent? Even so there is no more statements with
>> pid=645, so no operation to be seen that could be committed. Very
>> suspicious.
>>
>> https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0
>>
>> se_gov 1532
>> se_sm 1711
>> se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
>> nanosleep
>> se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in
>> nanosleep
>> se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12139 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 12183 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 12186 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 12195 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 1 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12302 - fast XQuery documents:getFilesByBasicTypeId, not COMMITED,
>> gdb thread 1 in nanosleep
>> se_trn 12307 - document("$documents") XQuery, not COMMITED, gdb thread 1
>> in nanosleep
>> se_trn 12310 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12377 - Session is ready
>> se_trn 12382 - Session is ready
>> se_trn 12423 - Session is ready
>> se_trn 12848 - Session is ready
>>
>> Comment:
>> I understand many of those XQueri

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-15 Thread Charles Foster
Hi Ivan,

Just to clarify (not sure if my previous e-mail got through), Sedna XML:DB
API support is "still available" through either a support contract or paid
individual bug fixes. Please contact me directly for more information.

Kind Regards,

Charles


On 14 October 2014 23:08, Ilya Taranov  wrote:

> I'm not sure, latest build will fix the problem.
>
> It's definitely some client-to-app communication problem.
> If I make a git patch for you with additional debug info, will you be able
> to apply it (assuming, the data file version will be the same)? What build
> do you currently use?
>
> On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> I've analyzed logs for other issue occurrences and it has appeared more
>> complicated.
>>
>> First, could you please clarify about "Session is ready" statement, when
>> exactly is it dumped to the log file? I'm wondering because when the
>> database is hanging, those log entries still appear at some random moments
>> of time.
>>
>> Second, I'm going to provide the investigation details for several more
>> issue occurrences here. Below "fast XQuery" means that it uses index and is
>> expected to be very fast.
>>
>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>
>> se_trn 27173 - DROP DOCUMENT operation, COMMITED
>> se_gov 29197
>> se_sm 29224
>> se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 29820 - Session is ready
>> se_trn 29976 - Session is ready
>>
>> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>>
>> se_gov 15890
>> se_sm 15899
>> se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
>> nanosleep
>> se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
>> se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
>> nanosleep
>> se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 22010 - Session is ready
>> se_trn 22013 - Session is ready
>> se_trn 22152 - Session is ready
>> se_trn 22295 - Session is ready
>> se_trn 23615 - Session is ready
>> se_trn 24315 - Session is ready
>>
>> Questions:
>> 1. Does nanosleep state mean that the operation is waiting for another
>> one (here it's probably 21996 XQuery in select)?
>> 2. Why does doc-available wait for any other operation? I'd expect it to
>> be non-blocking and to run immediately on the latest data snapshot.
>> 3. Why does doc('$version') wait for any other operation? It's a database
>> metadata being fetched, so I'd also expect it to run immediately.
>>
>> https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0
>>
>> se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread 1
>> in select
>> se_trn 645 - Session is ready & COMMITED without operation
>> se_trn 648 - Session is ready
>> se_trn 823 - Session is ready
>> se_trn 898 - Session is ready
>> se_gov 8444
>> se_sm 8453
>>
>> Questions:
>> What is going on with 645? There is "Session in ready" statement
>> immediately followed by "COMMITED" statement in the event.log. What
>> operation was committed? Is it possible the order of statements in the
>> event.log is inconsistent? Even so there is no more statements with
>> pid=645, so no operation to be seen that could be committed. Very
>> suspicious.
>>
>> https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0
>>
>> se_gov 1532
>> se_sm 1711
>> se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
>> nanosleep
>> se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in
>> nanosleep
>> se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12139 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 12183 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
>> thread 1 in select
>> se_trn 12186 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 12195 - XQuery collection("groupingDefinitions/wip"), COMMITED,
>> gdb thread 1 in select
>> se_trn 1 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12302 - fast XQuery documents:getFilesByBasicTypeId, not COMMITED,
>> gdb thread 1 in nanosleep
>> se_trn 12307 - document("$documents") XQuery, not COMMITED, gdb thread 1
>> in nanosleep
>> se_trn 12310 - document("$collections") XQuery, not COMMITED, gdb thread
>> 1 in nanosleep
>> se_trn 12377 - Session is ready
>> se_trn 12382 - Session is ready
>> se_trn 12423 -

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-14 Thread Ilya Taranov
I'm not sure, latest build will fix the problem.

It's definitely some client-to-app communication problem.
If I make a git patch for you with additional debug info, will you be able
to apply it (assuming, the data file version will be the same)? What build
do you currently use?

On Sat, Oct 11, 2014 at 12:48 AM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> I've analyzed logs for other issue occurrences and it has appeared more
> complicated.
>
> First, could you please clarify about "Session is ready" statement, when
> exactly is it dumped to the log file? I'm wondering because when the
> database is hanging, those log entries still appear at some random moments
> of time.
>
> Second, I'm going to provide the investigation details for several more
> issue occurrences here. Below "fast XQuery" means that it uses index and is
> expected to be very fast.
>
> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>
> se_trn 27173 - DROP DOCUMENT operation, COMMITED
> se_gov 29197
> se_sm 29224
> se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
> thread 1 in select
> se_trn 29820 - Session is ready
> se_trn 29976 - Session is ready
>
> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>
> se_gov 15890
> se_sm 15899
> se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in
> nanosleep
> se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
> se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in
> nanosleep
> se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
> thread 1 in select
> se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 22010 - Session is ready
> se_trn 22013 - Session is ready
> se_trn 22152 - Session is ready
> se_trn 22295 - Session is ready
> se_trn 23615 - Session is ready
> se_trn 24315 - Session is ready
>
> Questions:
> 1. Does nanosleep state mean that the operation is waiting for another one
> (here it's probably 21996 XQuery in select)?
> 2. Why does doc-available wait for any other operation? I'd expect it to
> be non-blocking and to run immediately on the latest data snapshot.
> 3. Why does doc('$version') wait for any other operation? It's a database
> metadata being fetched, so I'd also expect it to run immediately.
>
> https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0
>
> se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread 1
> in select
> se_trn 645 - Session is ready & COMMITED without operation
> se_trn 648 - Session is ready
> se_trn 823 - Session is ready
> se_trn 898 - Session is ready
> se_gov 8444
> se_sm 8453
>
> Questions:
> What is going on with 645? There is "Session in ready" statement
> immediately followed by "COMMITED" statement in the event.log. What
> operation was committed? Is it possible the order of statements in the
> event.log is inconsistent? Even so there is no more statements with
> pid=645, so no operation to be seen that could be committed. Very
> suspicious.
>
> https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0
>
> se_gov 1532
> se_sm 1711
> se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
> nanosleep
> se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in
> nanosleep
> se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 12139 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
> thread 1 in select
> se_trn 12183 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
> thread 1 in select
> se_trn 12186 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
> thread 1 in select
> se_trn 12195 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
> thread 1 in select
> se_trn 1 - document("$collections") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 12302 - fast XQuery documents:getFilesByBasicTypeId, not COMMITED,
> gdb thread 1 in nanosleep
> se_trn 12307 - document("$documents") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 12310 - document("$collections") XQuery, not COMMITED, gdb thread 1
> in nanosleep
> se_trn 12377 - Session is ready
> se_trn 12382 - Session is ready
> se_trn 12423 - Session is ready
> se_trn 12848 - Session is ready
>
> Comment:
> I understand many of those XQueries are not optimal, they are part of an
> application developed by other devs. Now I'll take care of improving those
> operations to replace document("$documents") and document("$collections")
> with doc-available and col-available accordingly and to use an index for
> collection("groupingDefinitions/wip").
>
> Overall conc

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-11 Thread Ivan Lagunov
Hi Ilya,

I've analyzed logs for other issue occurrences and it has appeared more
complicated.

First, could you please clarify about "Session is ready" statement, when
exactly is it dumped to the log file? I'm wondering because when the
database is hanging, those log entries still appear at some random moments
of time.

Second, I'm going to provide the investigation details for several more
issue occurrences here. Below "fast XQuery" means that it uses index and is
expected to be very fast.

https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0

se_trn 27173 - DROP DOCUMENT operation, COMMITED
se_gov 29197
se_sm 29224
se_trn 29617 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
thread 1 in select
se_trn 29820 - Session is ready
se_trn 29976 - Session is ready

https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0

se_gov 15890
se_sm 15899
se_trn 21611 - doc-available XQuery, not COMMITED, gbd thread 1 in nanosleep
se_trn 21709 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
se_trn 21713 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
se_trn 21716 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
se_trn 21719 - doc('$version') XQuery, COMMITED, gdb thread 1 in select
se_trn 21727 - LOAD STDIN operation, not COMMITED, gbd thread 1 in nanosleep
se_trn 21996 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
thread 1 in select
se_trn 22002 - document("$collections") XQuery, not COMMITED, gdb thread 1
in nanosleep
se_trn 22010 - Session is ready
se_trn 22013 - Session is ready
se_trn 22152 - Session is ready
se_trn 22295 - Session is ready
se_trn 23615 - Session is ready
se_trn 24315 - Session is ready

Questions:
1. Does nanosleep state mean that the operation is waiting for another one
(here it's probably 21996 XQuery in select)?
2. Why does doc-available wait for any other operation? I'd expect it to be
non-blocking and to run immediately on the latest data snapshot.
3. Why does doc('$version') wait for any other operation? It's a database
metadata being fetched, so I'd also expect it to run immediately.

https://www.dropbox.com/s/3ibo4dvzapfdirw/sedna_debug_info3.zip?dl=0

se_trn 641 - fast XQuery packages:getPackage, not COMMITED, gdb thread 1 in
select
se_trn 645 - Session is ready & COMMITED without operation
se_trn 648 - Session is ready
se_trn 823 - Session is ready
se_trn 898 - Session is ready
se_gov 8444
se_sm 8453

Questions:
What is going on with 645? There is "Session in ready" statement
immediately followed by "COMMITED" statement in the event.log. What
operation was committed? Is it possible the order of statements in the
event.log is inconsistent? Even so there is no more statements with
pid=645, so no operation to be seen that could be committed. Very
suspicious.

https://www.dropbox.com/s/38idxmpzjv6rvxf/sedna_debug_info4.zip?dl=0

se_gov 1532
se_sm 1711
se_trn 6837 - DROP DOCUMENT operation, not COMMITED, gdb thread 1 in
nanosleep
se_trn 11703 - LOAD STDIN operation, not COMMITED, gdb thread 1 in nanosleep
se_trn 11944 - document("$collections") XQuery, not COMMITED, gdb thread 1
in nanosleep
se_trn 11972 - document("$collections") XQuery, not COMMITED, gdb thread 1
in nanosleep
se_trn 12139 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
thread 1 in select
se_trn 12183 - fast XQuery basictypes:getBasicType, not COMMITED, gdb
thread 1 in select
se_trn 12186 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
thread 1 in select
se_trn 12195 - XQuery collection("groupingDefinitions/wip"), COMMITED, gdb
thread 1 in select
se_trn 1 - document("$collections") XQuery, not COMMITED, gdb thread 1
in nanosleep
se_trn 12302 - fast XQuery documents:getFilesByBasicTypeId, not COMMITED,
gdb thread 1 in nanosleep
se_trn 12307 - document("$documents") XQuery, not COMMITED, gdb thread 1 in
nanosleep
se_trn 12310 - document("$collections") XQuery, not COMMITED, gdb thread 1
in nanosleep
se_trn 12377 - Session is ready
se_trn 12382 - Session is ready
se_trn 12423 - Session is ready
se_trn 12848 - Session is ready

Comment:
I understand many of those XQueries are not optimal, they are part of an
application developed by other devs. Now I'll take care of improving those
operations to replace document("$documents") and document("$collections")
with doc-available and col-available accordingly and to use an index for
collection("groupingDefinitions/wip").

Overall conclusion:
There is no single operation/XQuery causing the database hanging. There
were at least three different XQueries observed. Taking into account the
seemingly random occurrences, it looks like the protocol issue indeed. I
don't see how to apply tcpdump in this case as it's Production environment.
Any ideas?

I'd try to rebuild the database from sources and, perhaps, to take the
latest night 3.5 (or 3.6?) build for this. Do you think it makes sense to
try?

Best regards,
Ivan Lagunov

On Fri, Oct 10, 2014 at 1:23 AM, Ilya Taranov  wrote:

> Hi

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-09 Thread Ilya Taranov
Hi, from the top of my head...

1. Queries are evaluated in a lazy fashion. Which means, that before client
asks to fetch the next row, nothing is evaluated.

So select() is a socket read, which is waiting for client to make a
decision on whether to commit (or end RO transaction) or ask for next part
of the result. All locks are only released when end transaction happens.
That's why others are waiting for one single client.

To my mind, it's a wrong approach, and that's not entirely how I would do
it right now, But that days we had ACID as a primary goal in minds.

2. Session is ready, means that there is no ongoing transaction, but the
connection is established.

3. No idea in this particular case, but there can be a situation when
session is started without client connection. Recovery/checkpoint, maybe
something else, I'm not sure.

4. That dfntly looks strange, but should not cause a lock. Once transaction
is committed, it should release all the locks. So even should there be
orphan sessions, they would not interfere the normal operation.

5. Yes that pretty much looks like a problem. Based on logs, it looks like
17338 is waiting for client to fetch a result.

That may be a protocol issue, by the way, when as a result of malformed
message both client and sedna are waiting for each other.

I would guess, that's always happens when "categorytree:getCategoryTree()"
is called, could you please try that? If it confirms, could you please make
a wireshark/tcpdump dump of this communication?


On Thu, Oct 9, 2014 at 6:42 AM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> Please check my investigations below.
>
> I've collected all the debug information for another hanging today, it's
> available here:
> https://www.dropbox.com/s/g14aer9821fzx05/sedna_debug_info6.zip?dl=0
> Here are the statuses of all Sedna processes:
>
> se_trn 13436 - DROP DOCUMENT operation, COMMITED
> se_trn 16245 - LOAD STDIN operation, COMMITED, no ESTABLISHED connection
> on port 5050 found (only 5151)
> se_trn 17335 - doc('$version') XQuery, COMMITED, gdb thread 1 in select ()
> from /lib64/libc.so.6
> se_trn 17338 - XQuery running about 10sec, not COMMITED, gdb thread 1 in
> select () from /lib64/libc.so.6
> se_trn 17410 - XQuery running <1sec, COMMITED
> se_trn 17462 - Session is ready
> se_trn 17554 - Session is ready
> se_trn 18892 - Session is ready
> se_trn 19124 - Session is ready
> se_gov 20441 - gdb threads 1 and 2 in select () from /lib64/libc.so.6
> se_sm 20449
>
> Comments:
> 1. Where gdb is not stated, all threads are in semop () from
> /lib64/libc.so.6 except the last thread that is in 
> pthread_cond_timedwait@@GLIBC_2.3.2
> (). I believe semop is a good state, correct me if I'm wrong.
> 2. All se_trn with "Session is ready" have no other entries in the latest
> event.log. As I collect only the latest event.log, I cannot check what was
> the last operation.
> 3. All se_trn processes have two ESTABLISHED connections in netstat for
> ports 5050 and 5151 if not stated otherwise.
> 4. se_trn 17338, 17410, 17462 were created by the same application that is
> under suspicion. The database unfreezed after the restart of that
> application.
>
> Issues/questions:
> 1. What exactly does it mean when a thread is in select()? It seems the
> database is waiting but for the client or for some internal operation, e.g.
> fetching the data. Please clarify.
> 2. What exactly does "Session is ready" mean? When does it appear in the
> event.log? I believe those connections are okay but I'm curious to know.
> 3. se_trn 16245 - how is it possible that there is no ESTABLISHED
> connection on port 5050 while the process is still alive? An issue?
> 4. se_trn 17335 - how is it possible that thread is in select() while the
> transaction has been COMMITED in event.log? An issue?
> 5. se_trn 17338 - looks like a root cause of Sedna hanging. Currently I
> run it via XML:DB API implementation mentioned earlier, more specifically I
> run the XQuery via XQueryService.query("xquery string") call. Besides, I
> always try to close the connection in the Java finally block, so it looks
> roughly like this:
>
> try {
> rootCollection = DatabaseManager.getCollection(...);
> ... do something ...
> } finally {
> rootCollection.close();
> }
>
> According to event.log the XQuery was started at 09/10/2014 12:00:01.
> In the application log somewhere between 12:01:53 and 12:05:00 appeared a
> message:
>
>> Warning: an open Sedna XML:DB Collection/Connection was just
>> automatically closed due to inactivity, you SHOULD close these manually.
>
> It seems to mean that the connection was considered as abandoned and
> removed by XML:DB API implementation (see above conn-log-abandoned=true
> and conn-remove-abandoned-timeout=12).
> For me it looks like the client waited for 2 minutes but did not get any
> response from the database. So there was an attempt to close the
> connection. It's easy to understand as the database hanged at about
> 12:00:02 when the last tra

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-09 Thread Ivan Lagunov
Hi Ilya,

Please check my investigations below.

I've collected all the debug information for another hanging today, it's
available here:
https://www.dropbox.com/s/g14aer9821fzx05/sedna_debug_info6.zip?dl=0
Here are the statuses of all Sedna processes:

se_trn 13436 - DROP DOCUMENT operation, COMMITED
se_trn 16245 - LOAD STDIN operation, COMMITED, no ESTABLISHED connection on
port 5050 found (only 5151)
se_trn 17335 - doc('$version') XQuery, COMMITED, gdb thread 1 in select ()
from /lib64/libc.so.6
se_trn 17338 - XQuery running about 10sec, not COMMITED, gdb thread 1 in
select () from /lib64/libc.so.6
se_trn 17410 - XQuery running <1sec, COMMITED
se_trn 17462 - Session is ready
se_trn 17554 - Session is ready
se_trn 18892 - Session is ready
se_trn 19124 - Session is ready
se_gov 20441 - gdb threads 1 and 2 in select () from /lib64/libc.so.6
se_sm 20449

Comments:
1. Where gdb is not stated, all threads are in semop () from
/lib64/libc.so.6 except the last thread that is in
pthread_cond_timedwait@@GLIBC_2.3.2
(). I believe semop is a good state, correct me if I'm wrong.
2. All se_trn with "Session is ready" have no other entries in the latest
event.log. As I collect only the latest event.log, I cannot check what was
the last operation.
3. All se_trn processes have two ESTABLISHED connections in netstat for
ports 5050 and 5151 if not stated otherwise.
4. se_trn 17338, 17410, 17462 were created by the same application that is
under suspicion. The database unfreezed after the restart of that
application.

Issues/questions:
1. What exactly does it mean when a thread is in select()? It seems the
database is waiting but for the client or for some internal operation, e.g.
fetching the data. Please clarify.
2. What exactly does "Session is ready" mean? When does it appear in the
event.log? I believe those connections are okay but I'm curious to know.
3. se_trn 16245 - how is it possible that there is no ESTABLISHED
connection on port 5050 while the process is still alive? An issue?
4. se_trn 17335 - how is it possible that thread is in select() while the
transaction has been COMMITED in event.log? An issue?
5. se_trn 17338 - looks like a root cause of Sedna hanging. Currently I run
it via XML:DB API implementation mentioned earlier, more specifically I run
the XQuery via XQueryService.query("xquery string") call. Besides, I always
try to close the connection in the Java finally block, so it looks roughly
like this:

try {
rootCollection = DatabaseManager.getCollection(...);
... do something ...
} finally {
rootCollection.close();
}

According to event.log the XQuery was started at 09/10/2014 12:00:01.
In the application log somewhere between 12:01:53 and 12:05:00 appeared a
message:

> Warning: an open Sedna XML:DB Collection/Connection was just automatically
> closed due to inactivity, you SHOULD close these manually.

It seems to mean that the connection was considered as abandoned and
removed by XML:DB API implementation (see above conn-log-abandoned=true and
conn-remove-abandoned-timeout=12).
For me it looks like the client waited for 2 minutes but did not get any
response from the database. So there was an attempt to close the
connection. It's easy to understand as the database hanged at about
12:00:02 when the last transaction was COMMITED. So it may be even that the
root cause is hidden somewhere deeper. For example, this long XQuery could
cause a deadlock.

I understand it may be an issue in XML:DB API implementation but its
support was stopped by Charles Foster. Unfortunately, we cannot afford to
migrate to XQJ API implementation yet. The most tricky part to understand
is why we had not faced similar issues before recent occurrence when it has
become a regular major issue immediately. There were no server updates and
no changes that could affect the way transactions/connections are handled.

Best regards,
Ivan Lagunov

On Thu, Oct 9, 2014 at 1:33 PM, Ivan Lagunov  wrote:

> Sorry, copied the wrong link for the connection pool documentation. Here
> is the correct one:
> http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10
>
> Besides, I use the following values for the connection pool parameters in
> the suspected application:
> conn-max-active=50
> conn-max-wait=3
> conn-log-abandoned=true
> conn-remove-abandoned-timeout=12
> conn-abandoned-check-interval=5000
>
> Best regards,
> Ivan Lagunov
>
> On Thu, Oct 9, 2014 at 1:28 PM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> I use Charles Foster implementation of XML:DB API, version 1.2.6 without
>> any customizations: http://www.cfoster.net/sedna/xmldb/download/
>>
>> Here is the link again to the connection pool documentation (it's a part
>> of the above implementation):
>> http://www.cfoster.net/sedna/xmldb/download/
>>
>> Meanwhile, I've identified the application that makes Sedna wait. At
>> least restarting the application has unfreezed the database twice already.
>> I'll analyze the application log along with Sedna event

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-09 Thread Ivan Lagunov
Sorry, copied the wrong link for the connection pool documentation. Here is
the correct one:
http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10

Besides, I use the following values for the connection pool parameters in
the suspected application:
conn-max-active=50
conn-max-wait=3
conn-log-abandoned=true
conn-remove-abandoned-timeout=12
conn-abandoned-check-interval=5000

Best regards,
Ivan Lagunov

On Thu, Oct 9, 2014 at 1:28 PM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> I use Charles Foster implementation of XML:DB API, version 1.2.6 without
> any customizations: http://www.cfoster.net/sedna/xmldb/download/
>
> Here is the link again to the connection pool documentation (it's a part
> of the above implementation): http://www.cfoster.net/sedna/xmldb/download/
>
> Meanwhile, I've identified the application that makes Sedna wait. At least
> restarting the application has unfreezed the database twice already. I'll
> analyze the application log along with Sedna event.log shortly. Hopefully,
> I'll find the root cause.
>
> Best regards,
> Ivan Lagunov
>
> On Thu, Oct 9, 2014 at 3:50 AM, Ilya Taranov  wrote:
>
>> I can easily see the scenario, where one of the clients doesn't terminate
>> the transaction properly. Unfortunately, current application timeout
>> doesn't apply for client hangs. The issue can be easily fixed, however.
>>
>> Unfortunately, even fix still would not solve your problem, you would
>> still have application timeout wait on locks. Probably the most
>> straightforward way to find the cause for you would be to log connection
>> checkins/checkouts at the client driver level.
>>
>> Could you please give me a link to the connection pool + driver you
>> currently use (including all your changes if possible) so I could help you
>> with that?
>>
>> On Tue, Oct 7, 2014 at 4:37 AM, Ivan Lagunov  wrote:
>>
>>> Hi Ilya,
>>>
>>> Your suggestion seems to be correct. After I've restarted three Java
>>> applications that had connections to Sedna, the database has resumed
>>> processing operations. Here is the event.log where Sedna has hanged between
>>> 12:35:03 and 13:06:05:
>>>
>>>
>>> https://www.dropbox.com/s/pzafuvzcvfchxwa/event-2014-10-07-13-15-04-018.zip?dl=0
>>>
>>> I'm not sure which of the applications caused it though. It still
>>> worries me as the queries before hanging look quite simple and I cannot
>>> understand which client operation could have hanged the whole database.
>>>
>>> Best regards,
>>> Ivan Lagunov
>>>
>>> On Mon, Oct 6, 2014 at 12:29 PM, Ivan Lagunov  wrote:
>>>
 Hi Ilya,

 Yes, I use a connection pool via Sedna XML:DB API implementation. It's
 documented here:

 http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10

 Unfortunately, there is no support for active connection timeout. This
 could be the cause if there was a query hanging for some reason. Still if
 some query hangs forever (that should not as it's in my case) it looks more
 like a database bug/deadlock situation.

 I'll try as you suggest next time.

 P.S. Sorry for the second email as I've forgotten to add CC first.

 Best regards,
 Ivan Lagunov

 On Fri, Oct 3, 2014 at 10:23 PM, Ilya Taranov 
 wrote:

> Ivan, do you have a connection pool? I see one of the se_trn processes
> waiting for socket read.
>
> If you don't have a connection pool, it just might be a client
> problem. I'll take a deeper look anyway, just next time it happens, could
> you please try to shutdown the client thread, if it's even possible.
>
> On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov 
> wrote:
>
>> Hi all,
>>
>> I've been facing major issues with Sedna hanging regularly for the
>> last two weeks (after several months of stable work). I tried to do the
>> complete export-import of data using se_exp but Sedna still hangs
>> regularly. Looking into event.log does not help as queries differ every
>> time. Sometimes it looks like Sedna hangs after some heavy query (with 
>> huge
>> response) while in other cases there are no heavy queries to suspect. 
>> I've
>> written a script to collect the following logs/dumps you will find in the
>> archives:
>>
>> event.log - the last Sedna event log till the moment DB hangs and is
>> stopped;
>> sedna_procs.txt - list of all Sedna processes (ps -ef);
>> netstat.txt - list of all ESTABLISHED connections;
>> ipcs.txt - results of ipcs -s;
>> .txt - results of gdb with bt for all threads of a Sedna process
>> (available for all processes at the moment of hanging). This is collected
>> automatically for threads 1-4 of each process.
>>
>> I'm providing two sets of logs/dumps for two separate hanging events:
>>
>> 25th September - seemed to hang without any heavy queries:
>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>
>>>

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-09 Thread Ivan Lagunov
Hi Ilya,

I use Charles Foster implementation of XML:DB API, version 1.2.6 without
any customizations: http://www.cfoster.net/sedna/xmldb/download/

Here is the link again to the connection pool documentation (it's a part of
the above implementation): http://www.cfoster.net/sedna/xmldb/download/

Meanwhile, I've identified the application that makes Sedna wait. At least
restarting the application has unfreezed the database twice already. I'll
analyze the application log along with Sedna event.log shortly. Hopefully,
I'll find the root cause.

Best regards,
Ivan Lagunov

On Thu, Oct 9, 2014 at 3:50 AM, Ilya Taranov  wrote:

> I can easily see the scenario, where one of the clients doesn't terminate
> the transaction properly. Unfortunately, current application timeout
> doesn't apply for client hangs. The issue can be easily fixed, however.
>
> Unfortunately, even fix still would not solve your problem, you would
> still have application timeout wait on locks. Probably the most
> straightforward way to find the cause for you would be to log connection
> checkins/checkouts at the client driver level.
>
> Could you please give me a link to the connection pool + driver you
> currently use (including all your changes if possible) so I could help you
> with that?
>
> On Tue, Oct 7, 2014 at 4:37 AM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> Your suggestion seems to be correct. After I've restarted three Java
>> applications that had connections to Sedna, the database has resumed
>> processing operations. Here is the event.log where Sedna has hanged between
>> 12:35:03 and 13:06:05:
>>
>>
>> https://www.dropbox.com/s/pzafuvzcvfchxwa/event-2014-10-07-13-15-04-018.zip?dl=0
>>
>> I'm not sure which of the applications caused it though. It still worries
>> me as the queries before hanging look quite simple and I cannot understand
>> which client operation could have hanged the whole database.
>>
>> Best regards,
>> Ivan Lagunov
>>
>> On Mon, Oct 6, 2014 at 12:29 PM, Ivan Lagunov  wrote:
>>
>>> Hi Ilya,
>>>
>>> Yes, I use a connection pool via Sedna XML:DB API implementation. It's
>>> documented here:
>>>
>>> http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10
>>>
>>> Unfortunately, there is no support for active connection timeout. This
>>> could be the cause if there was a query hanging for some reason. Still if
>>> some query hangs forever (that should not as it's in my case) it looks more
>>> like a database bug/deadlock situation.
>>>
>>> I'll try as you suggest next time.
>>>
>>> P.S. Sorry for the second email as I've forgotten to add CC first.
>>>
>>> Best regards,
>>> Ivan Lagunov
>>>
>>> On Fri, Oct 3, 2014 at 10:23 PM, Ilya Taranov 
>>> wrote:
>>>
 Ivan, do you have a connection pool? I see one of the se_trn processes
 waiting for socket read.

 If you don't have a connection pool, it just might be a client problem.
 I'll take a deeper look anyway, just next time it happens, could you please
 try to shutdown the client thread, if it's even possible.

 On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov  wrote:

> Hi all,
>
> I've been facing major issues with Sedna hanging regularly for the
> last two weeks (after several months of stable work). I tried to do the
> complete export-import of data using se_exp but Sedna still hangs
> regularly. Looking into event.log does not help as queries differ every
> time. Sometimes it looks like Sedna hangs after some heavy query (with 
> huge
> response) while in other cases there are no heavy queries to suspect. I've
> written a script to collect the following logs/dumps you will find in the
> archives:
>
> event.log - the last Sedna event log till the moment DB hangs and is
> stopped;
> sedna_procs.txt - list of all Sedna processes (ps -ef);
> netstat.txt - list of all ESTABLISHED connections;
> ipcs.txt - results of ipcs -s;
> .txt - results of gdb with bt for all threads of a Sedna process
> (available for all processes at the moment of hanging). This is collected
> automatically for threads 1-4 of each process.
>
> I'm providing two sets of logs/dumps for two separate hanging events:
>
> 25th September - seemed to hang without any heavy queries:
> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>
> 2nd October - hanged after a heavy query:
> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
> The heavy query occurs at LOG   02/10/2014 11:10:20
>
> If you can find out the cause of the hangings, it'd be the best.
> Otherwise it'd be nice to hear some recommendations at least on how to
> avoid these issues/minimize an impact. May be I need to upgrade to some
> later version but I'm currently using 3.5.615 that is already a night 
> build
> that we had to upgrade to after some older issue (reported back in
> March-April of 2013). I'd try some 

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-08 Thread Ilya Taranov
I can easily see the scenario, where one of the clients doesn't terminate
the transaction properly. Unfortunately, current application timeout
doesn't apply for client hangs. The issue can be easily fixed, however.

Unfortunately, even fix still would not solve your problem, you would still
have application timeout wait on locks. Probably the most straightforward
way to find the cause for you would be to log connection checkins/checkouts
at the client driver level.

Could you please give me a link to the connection pool + driver you
currently use (including all your changes if possible) so I could help you
with that?

On Tue, Oct 7, 2014 at 4:37 AM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> Your suggestion seems to be correct. After I've restarted three Java
> applications that had connections to Sedna, the database has resumed
> processing operations. Here is the event.log where Sedna has hanged between
> 12:35:03 and 13:06:05:
>
>
> https://www.dropbox.com/s/pzafuvzcvfchxwa/event-2014-10-07-13-15-04-018.zip?dl=0
>
> I'm not sure which of the applications caused it though. It still worries
> me as the queries before hanging look quite simple and I cannot understand
> which client operation could have hanged the whole database.
>
> Best regards,
> Ivan Lagunov
>
> On Mon, Oct 6, 2014 at 12:29 PM, Ivan Lagunov  wrote:
>
>> Hi Ilya,
>>
>> Yes, I use a connection pool via Sedna XML:DB API implementation. It's
>> documented here:
>>
>> http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10
>>
>> Unfortunately, there is no support for active connection timeout. This
>> could be the cause if there was a query hanging for some reason. Still if
>> some query hangs forever (that should not as it's in my case) it looks more
>> like a database bug/deadlock situation.
>>
>> I'll try as you suggest next time.
>>
>> P.S. Sorry for the second email as I've forgotten to add CC first.
>>
>> Best regards,
>> Ivan Lagunov
>>
>> On Fri, Oct 3, 2014 at 10:23 PM, Ilya Taranov 
>> wrote:
>>
>>> Ivan, do you have a connection pool? I see one of the se_trn processes
>>> waiting for socket read.
>>>
>>> If you don't have a connection pool, it just might be a client problem.
>>> I'll take a deeper look anyway, just next time it happens, could you please
>>> try to shutdown the client thread, if it's even possible.
>>>
>>> On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov  wrote:
>>>
 Hi all,

 I've been facing major issues with Sedna hanging regularly for the last
 two weeks (after several months of stable work). I tried to do the complete
 export-import of data using se_exp but Sedna still hangs regularly. Looking
 into event.log does not help as queries differ every time. Sometimes it
 looks like Sedna hangs after some heavy query (with huge response) while in
 other cases there are no heavy queries to suspect. I've written a script to
 collect the following logs/dumps you will find in the archives:

 event.log - the last Sedna event log till the moment DB hangs and is
 stopped;
 sedna_procs.txt - list of all Sedna processes (ps -ef);
 netstat.txt - list of all ESTABLISHED connections;
 ipcs.txt - results of ipcs -s;
 .txt - results of gdb with bt for all threads of a Sedna process
 (available for all processes at the moment of hanging). This is collected
 automatically for threads 1-4 of each process.

 I'm providing two sets of logs/dumps for two separate hanging events:

 25th September - seemed to hang without any heavy queries:
 https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0

 2nd October - hanged after a heavy query:
 https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
 The heavy query occurs at LOG   02/10/2014 11:10:20

 If you can find out the cause of the hangings, it'd be the best.
 Otherwise it'd be nice to hear some recommendations at least on how to
 avoid these issues/minimize an impact. May be I need to upgrade to some
 later version but I'm currently using 3.5.615 that is already a night build
 that we had to upgrade to after some older issue (reported back in
 March-April of 2013). I'd try some 3.5 or 3.6 version but without knowing
 the reason there is no guarantee the potential bug is fixed in the later
 night build.

 Also it'd be really nice if there was an official release for 3.5 with
 all the latest bug fixes (I see the last night build is 3.5.627 of 8th May
 2013).

 Best regards,
 Ivan Lagunov


 --
 Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
 Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
 Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
 Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer

 http://pubad

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-07 Thread Ivan Lagunov
Hi Ilya,

Your suggestion seems to be correct. After I've restarted three Java
applications that had connections to Sedna, the database has resumed
processing operations. Here is the event.log where Sedna has hanged between
12:35:03 and 13:06:05:

https://www.dropbox.com/s/pzafuvzcvfchxwa/event-2014-10-07-13-15-04-018.zip?dl=0

I'm not sure which of the applications caused it though. It still worries
me as the queries before hanging look quite simple and I cannot understand
which client operation could have hanged the whole database.

Best regards,
Ivan Lagunov

On Mon, Oct 6, 2014 at 12:29 PM, Ivan Lagunov  wrote:

> Hi Ilya,
>
> Yes, I use a connection pool via Sedna XML:DB API implementation. It's
> documented here:
>
> http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10
>
> Unfortunately, there is no support for active connection timeout. This
> could be the cause if there was a query hanging for some reason. Still if
> some query hangs forever (that should not as it's in my case) it looks more
> like a database bug/deadlock situation.
>
> I'll try as you suggest next time.
>
> P.S. Sorry for the second email as I've forgotten to add CC first.
>
> Best regards,
> Ivan Lagunov
>
> On Fri, Oct 3, 2014 at 10:23 PM, Ilya Taranov 
> wrote:
>
>> Ivan, do you have a connection pool? I see one of the se_trn processes
>> waiting for socket read.
>>
>> If you don't have a connection pool, it just might be a client problem.
>> I'll take a deeper look anyway, just next time it happens, could you please
>> try to shutdown the client thread, if it's even possible.
>>
>> On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov  wrote:
>>
>>> Hi all,
>>>
>>> I've been facing major issues with Sedna hanging regularly for the last
>>> two weeks (after several months of stable work). I tried to do the complete
>>> export-import of data using se_exp but Sedna still hangs regularly. Looking
>>> into event.log does not help as queries differ every time. Sometimes it
>>> looks like Sedna hangs after some heavy query (with huge response) while in
>>> other cases there are no heavy queries to suspect. I've written a script to
>>> collect the following logs/dumps you will find in the archives:
>>>
>>> event.log - the last Sedna event log till the moment DB hangs and is
>>> stopped;
>>> sedna_procs.txt - list of all Sedna processes (ps -ef);
>>> netstat.txt - list of all ESTABLISHED connections;
>>> ipcs.txt - results of ipcs -s;
>>> .txt - results of gdb with bt for all threads of a Sedna process
>>> (available for all processes at the moment of hanging). This is collected
>>> automatically for threads 1-4 of each process.
>>>
>>> I'm providing two sets of logs/dumps for two separate hanging events:
>>>
>>> 25th September - seemed to hang without any heavy queries:
>>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>>
>>> 2nd October - hanged after a heavy query:
>>> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>>> The heavy query occurs at LOG   02/10/2014 11:10:20
>>>
>>> If you can find out the cause of the hangings, it'd be the best.
>>> Otherwise it'd be nice to hear some recommendations at least on how to
>>> avoid these issues/minimize an impact. May be I need to upgrade to some
>>> later version but I'm currently using 3.5.615 that is already a night build
>>> that we had to upgrade to after some older issue (reported back in
>>> March-April of 2013). I'd try some 3.5 or 3.6 version but without knowing
>>> the reason there is no guarantee the potential bug is fixed in the later
>>> night build.
>>>
>>> Also it'd be really nice if there was an official release for 3.5 with
>>> all the latest bug fixes (I see the last night build is 3.5.627 of 8th May
>>> 2013).
>>>
>>> Best regards,
>>> Ivan Lagunov
>>>
>>>
>>> --
>>> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
>>> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
>>> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
>>> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
>>>
>>> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
>>> ___
>>> Sedna-discussion mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>>>
>>>
>>
>
--
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk___
Sedna-discussion mailing list
Sedna-discussi

Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-06 Thread Ivan Lagunov
Hi Ilya,

Yes, I use a connection pool via Sedna XML:DB API implementation. It's
documented here:

http://www.cfoster.net/pdf/sedna/xmldb/documentation.pdf#page=10

Unfortunately, there is no support for active connection timeout. This
could be the cause if there was a query hanging for some reason. Still if
some query hangs forever (that should not as it's in my case) it looks more
like a database bug/deadlock situation.

I'll try as you suggest next time.

P.S. Sorry for the second email as I've forgotten to add CC first.

Best regards,
Ivan Lagunov

On Fri, Oct 3, 2014 at 10:23 PM, Ilya Taranov  wrote:

> Ivan, do you have a connection pool? I see one of the se_trn processes
> waiting for socket read.
>
> If you don't have a connection pool, it just might be a client problem.
> I'll take a deeper look anyway, just next time it happens, could you please
> try to shutdown the client thread, if it's even possible.
>
> On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov  wrote:
>
>> Hi all,
>>
>> I've been facing major issues with Sedna hanging regularly for the last
>> two weeks (after several months of stable work). I tried to do the complete
>> export-import of data using se_exp but Sedna still hangs regularly. Looking
>> into event.log does not help as queries differ every time. Sometimes it
>> looks like Sedna hangs after some heavy query (with huge response) while in
>> other cases there are no heavy queries to suspect. I've written a script to
>> collect the following logs/dumps you will find in the archives:
>>
>> event.log - the last Sedna event log till the moment DB hangs and is
>> stopped;
>> sedna_procs.txt - list of all Sedna processes (ps -ef);
>> netstat.txt - list of all ESTABLISHED connections;
>> ipcs.txt - results of ipcs -s;
>> .txt - results of gdb with bt for all threads of a Sedna process
>> (available for all processes at the moment of hanging). This is collected
>> automatically for threads 1-4 of each process.
>>
>> I'm providing two sets of logs/dumps for two separate hanging events:
>>
>> 25th September - seemed to hang without any heavy queries:
>> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>>
>> 2nd October - hanged after a heavy query:
>> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
>> The heavy query occurs at LOG   02/10/2014 11:10:20
>>
>> If you can find out the cause of the hangings, it'd be the best.
>> Otherwise it'd be nice to hear some recommendations at least on how to
>> avoid these issues/minimize an impact. May be I need to upgrade to some
>> later version but I'm currently using 3.5.615 that is already a night build
>> that we had to upgrade to after some older issue (reported back in
>> March-April of 2013). I'd try some 3.5 or 3.6 version but without knowing
>> the reason there is no guarantee the potential bug is fixed in the later
>> night build.
>>
>> Also it'd be really nice if there was an official release for 3.5 with
>> all the latest bug fixes (I see the last night build is 3.5.627 of 8th May
>> 2013).
>>
>> Best regards,
>> Ivan Lagunov
>>
>>
>> --
>> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
>> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
>> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
>> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
>> ___
>> Sedna-discussion mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>>
>>
>
--
Slashdot TV.  Videos for Nerds.  Stuff that Matters.
http://pubads.g.doubleclick.net/gampad/clk?id=160591471&iu=/4140/ostg.clktrk___
Sedna-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sedna-discussion


Re: [Sedna-discussion] Sedna 3.5.615 repeated hanging

2014-10-03 Thread Ilya Taranov
Ivan, do you have a connection pool? I see one of the se_trn processes
waiting for socket read.

If you don't have a connection pool, it just might be a client problem.
I'll take a deeper look anyway, just next time it happens, could you please
try to shutdown the client thread, if it's even possible.

On Thu, Oct 2, 2014 at 5:30 AM, Ivan Lagunov  wrote:

> Hi all,
>
> I've been facing major issues with Sedna hanging regularly for the last
> two weeks (after several months of stable work). I tried to do the complete
> export-import of data using se_exp but Sedna still hangs regularly. Looking
> into event.log does not help as queries differ every time. Sometimes it
> looks like Sedna hangs after some heavy query (with huge response) while in
> other cases there are no heavy queries to suspect. I've written a script to
> collect the following logs/dumps you will find in the archives:
>
> event.log - the last Sedna event log till the moment DB hangs and is
> stopped;
> sedna_procs.txt - list of all Sedna processes (ps -ef);
> netstat.txt - list of all ESTABLISHED connections;
> ipcs.txt - results of ipcs -s;
> .txt - results of gdb with bt for all threads of a Sedna process
> (available for all processes at the moment of hanging). This is collected
> automatically for threads 1-4 of each process.
>
> I'm providing two sets of logs/dumps for two separate hanging events:
>
> 25th September - seemed to hang without any heavy queries:
> https://www.dropbox.com/s/2luq53cstc4t3tl/sedna_debug_info1.zip?dl=0
>
> 2nd October - hanged after a heavy query:
> https://www.dropbox.com/s/scmszq9yphyv6xf/sedna_debug_info2.zip?dl=0
> The heavy query occurs at LOG   02/10/2014 11:10:20
>
> If you can find out the cause of the hangings, it'd be the best. Otherwise
> it'd be nice to hear some recommendations at least on how to avoid these
> issues/minimize an impact. May be I need to upgrade to some later version
> but I'm currently using 3.5.615 that is already a night build that we had
> to upgrade to after some older issue (reported back in March-April of
> 2013). I'd try some 3.5 or 3.6 version but without knowing the reason there
> is no guarantee the potential bug is fixed in the later night build.
>
> Also it'd be really nice if there was an official release for 3.5 with all
> the latest bug fixes (I see the last night build is 3.5.627 of 8th May
> 2013).
>
> Best regards,
> Ivan Lagunov
>
>
> --
> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
>
> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
> ___
> Sedna-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>
>
--
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk___
Sedna-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sedna-discussion