I am running a server instance under Amazon EC2 with MonetDB 5 (latest released 
version) installed.

If I boot up the instance and do NOT start or use MonetDB, then I can issue a 
reboot command at any time, and the system reboots and comes back up just fine.

However if I issue the commands:

mkdir -p /mnt/MonetDB5/dbfarm
merovingian
monetdb create demo
monetdb start demo
mclient -lsql --time --database=demo
sql> \q

Then when I reboot, the console shows the output below, the system hangs, and 
the instance never comes back up. Here is the console output (more notes on 
the issue follow it):

INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Stopping ConsoleKit: [  OK  ]
Stopping sshd: [  OK  ]
Stopping crond: [  OK  ]
Stopping system message bus: [  OK  ]
Shutting down kernel logger: [  OK  ]
Shutting down system logger: [  OK  ]
Shutting down interface eth0:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
iptables: Flushing firewall rules: [  OK  ]
iptables: Setting chains to policy ACCEPT: filter [  OK  ]
iptables: Unloading modules: [  OK  ]
Starting killall:  [  OK  ]
Sending all processes the TERM signal... ------------[ cut here ]------------
kernel BUG at include/linux/tracehook.h:369!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /class/net/sit0/address
Modules linked in: sit(U) tunnel4(U) fuse(U) ipv6(U) dm_mirror(U) 
dm_multipath(U) dm_mod(U) pcspkr(U) ext3(U) jbd(U) mbcache(U) uhci_hcd(U) 
ohci_hcd(U) ehci_hcd(U) xenblk(U) xennet(U)
CPU:    0
EIP:    0061:[<c10254df>]    Not tainted VLI
EFLAGS: 00210087   (2.6.21.7-2.fc8xen #1)
EIP is at release_task+0x38/0x2f1
eax: c12ff000   ebx: c2b9f450   ecx: f5416000   edx: 00000000
esi: c2b9f450   edi: 00000000   ebp: 00000001   esp: ed207e68
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0069
Process mserver5 (pid: 1124, ti=ed207000 task=c1404910 task.ti=ed207000)
Stack: 00000020 c1404910 00000000 ed7c15b0 c1026918 00000009 00000008 0007b6c0
       c1404d64 c1404d64 c14049c4 00000000 c14f30e4 c188bac0 00000000 ed207fb8
       c10269fb c14f30e4 00000009 b438a364 c102dcb2 00000000 00000000 00000000
Call Trace:
 [<c1026918>] do_exit+0x6d6/0x730
 [<c10269fb>] sys_exit_group+0x0/0xd
 [<c102dcb2>] get_signal_to_deliver+0x3d2/0x414
 [<c1004be7>] do_notify_resume+0x8c/0x6e6
 [<c12038ae>] do_page_fault+0x7a1/0xc24
 [<c10e48a8>] copy_to_user+0x3c/0x50
 [<c107d766>] sys_select+0x161/0x187
 [<c1005765>] work_notifysig+0x13/0x1a
 =======================
Code: 98 04 00 00 00 74 07 89 f0 e8 e6 98 02 00 8b 86 80 01 00 00 90 ff 48 04 
b8 00 f0 2f c1 e8 c5 c7 1d 00 83 be 90 00 00 00 20 74 04 <0f> 0b eb fe 83 be 98 
04 00 00 00 74 27 8d 86 a8 04 00 00 e8 ee
EIP: [<c10254df>] release_task+0x38/0x2f1 SS:ESP 0069:ed207e68
Fixing recursive fault but reboot is needed!
---------------------------------------------------

Note that if I run the same commands but omit mclient:

mkdir -p /mnt/MonetDB5/dbfarm
merovingian
monetdb create demo
monetdb start demo
# OMIT THIS TIME mclient -lsql --time --database=demo

then the problem does not occur and the instance reboots fine. However, once I 
run mclient, the reboot is guaranteed to lock up with the console output shown 
above.
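
To make the two cases easier to compare, the steps can be wrapped in a small 
script. This is only a sketch: the names repro_steps, run_steps and the RUN 
switch are hypothetical helpers of mine, not part of MonetDB, and by default 
the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: print (or, with RUN=1, execute) the reproduction steps.
# repro_steps, run_steps and RUN are hypothetical names, not MonetDB tools.

repro_steps() {
    # $1 = "yes" to include the mclient step, anything else to omit it
    echo "mkdir -p /mnt/MonetDB5/dbfarm"
    echo "merovingian"
    echo "monetdb create demo"
    echo "monetdb start demo"
    if [ "$1" = "yes" ]; then
        echo "mclient -lsql --time --database=demo"
    fi
}

run_steps() {
    repro_steps "$1" | while IFS= read -r cmd; do
        if [ "${RUN:-0}" = "1" ]; then
            # stdin is /dev/null so that an interactive client sees
            # end-of-input right away instead of waiting at a prompt
            sh -c "$cmd" </dev/null
        else
            echo "would run: $cmd"
        fi
    done
}

# Default is the case that reboots fine (no mclient).
run_steps "${1:-no}"
```

Passing "yes" as the first argument includes the mclient step, and setting 
RUN=1 in the environment actually executes the commands, so the failing and 
non-failing cases differ only in that one argument.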

What makes this issue especially bad is that on EC2, an instance that will not 
reboot has to be terminated, and when it is terminated all data on the 
instance is completely destroyed! So I do not get a second chance: once the 
server is locked up like this, it has to be destroyed and a new one built.

This is running on 32-bit Fedora Core 8.

What is causing this and how can I fix it? Thanks.

_______________________________________________
Monetdb-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-developers