Bugs item #2606020, was opened at 2009-02-16 16:59
Message generated for change (Comment added) made by skinkie
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2606020&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/Core
Group: SQL "stable"
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Stefan de Konink (skinkie)
Assigned to: Niels Nes (nielsnes)
Summary: BenchmarkSQL fails loading due to sudden transaction abort

Initial Comment:
In order to let BenchmarkSQL to work, I had to modify the test from 
Decimal(4,4) to Float in the create table statements. When running the 
loadSQL.sh code with the following configuration:

monetdb-jdbc.properties: 
driver=nl.cwi.monetdb.jdbc.MonetDriver
conn=jdbc:monetdb://localhost/demo?user=monetdb&password=monetdb&debug=true
user=monetdb
password=monetdb

At a certain moment when executing EXEC with its PREPARE statement, ROLLBACK 
warnings occur.

PREPARE INSERT INTO item  (i_id, i_name, i_price, i_data, i_im_id) VALUES (?, 
?, ?, ?, ?);

TX 1234798982371: sexec 2(22001, 'jvrRZibeQvUYlNcsAZ', 50.689998626708984, 
'pYeGnMiVfmWoUBbVjgoPiPueMBVTxnImYXHbBNkyVBmI', 8936);
....

RD 1234798982597: read final block: 3950 bytes
RX 1234798982597: !current transaction is aborted (please ROLLBACK)

I have ran valgrind on the mserver5 process which resulted in:
>==16992== Thread 5:
==16992== Syscall param write(buf) points to uninitialised byte(s)
==16992==    at 0x6AF37FB: (within /lib64/libpthread-2.9.so)
==16992==    by 0x5DA716E: GDKsave (gdk_storage.mx:284)
==16992==    by 0x5B55694: HEAPsave (gdk_heap.mx:383)
==16992==    by 0x5DA93D9: BATsave (gdk_storage.mx:559)
==16992==    by 0x5B52DBD: BBPsync (gdk_bbp.mx:3133)
==16992==    by 0x5B3C78F: TMsubcommit (gdk_tm.mx:189)
==16992==    by 0x5EC467D: bm_subcommit (gdk_logger.mx:1011)
==16992==    by 0x5EC772B: bm_commit (gdk_logger.mx:1735)
==16992==    by 0x5EC3A24: logger_commit (gdk_logger.mx:949)
==16992==    by 0x5EC5405: logger_exit (gdk_logger.mx:1226)
==16992==    by 0x5EC551A: logger_restart (gdk_logger.mx:1253)
==16992==    by 0x18C1F193: bl_restart (bat_logger.mx:64)
==16992==  Address 0x1c1e4441 is 8,353 bytes inside a block of size 20,504 
alloc'd
==16992==    at 0x4C2467E: malloc (in 
/usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==16992==    by 0x5C1710D: GDKmallocmax (gdk_utils.mx:1062)
==16992==    by 0x5C172AA: GDKmalloc (gdk_utils.mx:1085)
==16992==    by 0x5DA73EE: GDKload (gdk_storage.mx:330)
==16992==    by 0x5B5550B: HEAPload (gdk_heap.mx:341)
==16992==    by 0x5DA9AE0: BATload_intern (gdk_storage.mx:643)
==16992==    by 0x5B4B849: BBPdescriptor (gdk_bbp.mx:1864)
==16992==    by 0x18C1F812: BATdescriptor (gdk.h:1324)
==16992==    by 0x18C1F786: temp_descriptor (bat_utils.mx:123)
==16992==    by 0x18C1EC76: tr_snapshot_bat (bat_storage.mx:1405)
==16992==    by 0x18C1EDB8: snapshot_table (bat_storage.mx:1434)
==16992==    by 0x18C11EE5: rollforward_update_table (store.mx:2289)


----------------------------------------------------------------------

>Comment By: Stefan de Konink (skinkie)
Date: 2009-02-16 23:10

Message:
For everyone (from cwi) that would like to reproduce this thing; on loki in
/tmp/benchmark.tar.bz2 can be found.

- create a demo database
- extract this file
- cd run
- sh runSQL.sh sqlTableCreates  (some expected trunkcate fails)
- sh loadData.sh monetdb-jdbc.properties

Be VERY sure you run this code with Java 1.5; 1.6 fails.

--

Something that I found out too...

==17410== 31,896 (11,752 direct, 20,144 indirect) bytes in 242 blocks are
definitely lost in loss record 28 of 33
==17410==    at 0x4C2467E: malloc (in
/usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==17410==    by 0x5C1710D: GDKmallocmax (gdk_utils.mx:1062)
==17410==    by 0x5C172AA: GDKmalloc (gdk_utils.mx:1085)
==17410==    by 0x5C17864: GDKstrdup (gdk_utils.mx:1235)
==17410==    by 0x5E873E8: VALcopy (gdk_value.mx:162)
==17410==    by 0x4E49D30: callMAL (mal_interpreter.mx:221)
==17410==    by 0x18B93A9A: SQLexecutePrepared (sql_scenario.mx:1185)
==17410==    by 0x18B93D0F: SQLengineIntern (sql_scenario.mx:1238)
==17410==    by 0x18B941E2: SQLengine (sql_scenario.mx:1348)
==17410==    by 0x4E83ACB: runPhase (mal_scenario.mx:591)
==17410==    by 0x4E83C5E: runScenarioBody (mal_scenario.mx:636)
==17410==    by 0x4E83DF8: runScenario (mal_scenario.mx:661)
==17410==    by 0x4E47CCA: MSserveClient (mal_session.mx:459)
==17410==    by 0x5EC05E7: wrapper_routine (gdk_posix.mx:940)
==17410==    by 0x6AED006: start_thread (in /lib64/libpthread-2.9.so)
==17410==    by 0x7A1938C: clone (in /lib64/libc-2.9.so)

--

I have also created a different test build from the queries that are
actually inserted. This can be found as /tmp/prepare-exec.tar.bz2 on loki.
This creates a list with:

MAPI  = mone...@localhost:50000
QUERY = exec 1(4640, 'HQiudJRsmjTrvpyVnOym', 41.880001068115234,
'dwkXYLpgwwZxSuCGDWMoqiCLegJfKXUykvkRvevmfEt', 9335)
ERROR = !no prepared statement with the given id

(exec 0..8 were defined)

The thing is, that these inserts in mclient do not create the previously
issued valgrind traces. Then again, they do not rollback...

----------------------------------------------------------------------

Comment By: Stefan de Konink (skinkie)
Date: 2009-02-16 19:48

Message:
  Elasped Time(ms): 4.089       Writing record 41000 of 100000
Error(s) occurred while executing the batch, see next SQLExceptions for
details

Start District Data for 10 Dists @ Mon Feb 16 19:45:22 CET 2009 ...
current transaction is aborted (please ROLLBACK)

Start Cust-Hist Load for 60000 Cust-Hists @ Mon Feb 16 19:45:22 CET 2009
...
  Elasped Time(ms): 2.78        Writing record 1000 of 60000
Error(s) occurred while executing the batch, see next SQLExceptions for
details
whse=1, dist=10, cust=3000

Start Order-Line-New Load for approx 340000 rows @ Mon Feb 16 19:45:24 CET
2009 
...
  Elasped Time(ms): 1.461       Writing record 2000 of 340000
java.sql.BatchUpdateException: Error(s) occurred while executing the
batch, see 
next SQLExceptions for details
        at
nl.cwi.monetdb.jdbc.MonetStatement.executeBatch(MonetStatement.java:2
06)
        at LoadData.loadOrder(LoadData.java:1072)
        at LoadData.main(LoadData.java:113)


>==17410== Thread 5:
==17410== Syscall param msync(start) points to uninitialised byte(s)
==17410==    at 0x6AF410B: (within /lib64/libpthread-2.9.so)
==17410==    by 0x5EC0BD7: MT_msync (gdk_posix.mx:1101)
==17410==    by 0x5DA70A2: GDKsave (gdk_storage.mx:274)
==17410==    by 0x5B55694: HEAPsave (gdk_heap.mx:383)
==17410==    by 0x5B54963: HEAPextend (gdk_heap.mx:157)
==17410==    by 0x5C20E68: strPut (gdk_atoms.mx:2075)
==17410==    by 0x5AF12FC: BATappend (gdk_batop.mx:371)
==17410==    by 0x18C1DFE3: tr_update_delta (bat_storage.mx:1189)
==17410==    by 0x18C1E572: update_table (bat_storage.mx:1277)
==17410==    by 0x18C11FD5: rollforward_update_table (store.mx:2299)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C123B3: rollforward_update_schema (store.mx:2372)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C12551: rollforward_trans (store.mx:2394)
==17410==    by 0x18C13669: sql_trans_commit (store.mx:2768)
==17410==    by 0x18BD693B: mvc_commit (sql_mvc.mx:443)
==17410==    by 0x18BAEC67: transactions (sql_trans.mx:65)
==17410==    by 0x18BBAC3E: semantic (sql_semantic.mx:1005)
==17410==    by 0x18BBAD89: output_semantic (sql_semantic.mx:1094)
==17410==    by 0x18B579F5: sql_symbol2stmt (sql.mx:1094)
==17410==  Address 0x1c254fe6 is 2,490,342 bytes inside a block of size
3,276,800 alloc'd
==17410==    at 0x5C17B16: GDKmmap (gdk_utils.mx:1266)
==17410==    by 0x5DA7668: GDKload (gdk_storage.mx:376)
==17410==    by 0x5B5550B: HEAPload (gdk_heap.mx:341)
==17410==    by 0x5B54D19: HEAPextend (gdk_heap.mx:199)
==17410==    by 0x5C20E68: strPut (gdk_atoms.mx:2075)
==17410==    by 0x5AF12FC: BATappend (gdk_batop.mx:371)
==17410==    by 0x18C1DFE3: tr_update_delta (bat_storage.mx:1189)
==17410==    by 0x18C1E572: update_table (bat_storage.mx:1277)
==17410==    by 0x18C11FD5: rollforward_update_table (store.mx:2299)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C123B3: rollforward_update_schema (store.mx:2372)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C12551: rollforward_trans (store.mx:2394)
==17410==    by 0x18C13669: sql_trans_commit (store.mx:2768)
==17410==    by 0x18BD693B: mvc_commit (sql_mvc.mx:443)
==17410==    by 0x18BAEC67: transactions (sql_trans.mx:65)
==17410==    by 0x18BBAC3E: semantic (sql_semantic.mx:1005)
==17410==    by 0x18BBAD89: output_semantic (sql_semantic.mx:1094)
==17410==    by 0x18B579F5: sql_symbol2stmt (sql.mx:1094)
==17410==    by 0x18B9323A: SQLparser (sql_scenario.mx:1013)
>==17410==
==17410== Thread 4:
==17410== Syscall param write(buf) points to uninitialised byte(s)
==17410==    at 0x6AF37FB: (within /lib64/libpthread-2.9.so)
==17410==    by 0x5DA716E: GDKsave (gdk_storage.mx:284)
==17410==    by 0x5B55694: HEAPsave (gdk_heap.mx:383)
==17410==    by 0x5DA93D9: BATsave (gdk_storage.mx:559)
==17410==    by 0x5B52DBD: BBPsync (gdk_bbp.mx:3133)
==17410==    by 0x5B3C78F: TMsubcommit (gdk_tm.mx:189)
==17410==    by 0x5EC467D: bm_subcommit (gdk_logger.mx:1011)
==17410==    by 0x5EC772B: bm_commit (gdk_logger.mx:1735)
==17410==    by 0x5EC3A24: logger_commit (gdk_logger.mx:949)
==17410==    by 0x5EC5405: logger_exit (gdk_logger.mx:1226)
==17410==    by 0x5EC551A: logger_restart (gdk_logger.mx:1253)
==17410==    by 0x18C1F193: bl_restart (bat_logger.mx:64)
==17410==    by 0x18C0F5AF: store_manager (store.mx:1435)
==17410==    by 0x18BD64D5: mvc_logmanager (sql_mvc.mx:342)
==17410==    by 0x5EC05E7: wrapper_routine (gdk_posix.mx:940)
==17410==    by 0x6AED006: start_thread (in /lib64/libpthread-2.9.so)
==17410==    by 0x7A1938C: clone (in /lib64/libc-2.9.so)
==17410==  Address 0x16295119 is 8,377 bytes inside a block of size 20,504
alloc'd
==17410==    at 0x4C2467E: malloc (in
/usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==17410==    by 0x5C1710D: GDKmallocmax (gdk_utils.mx:1062)
==17410==    by 0x5C172AA: GDKmalloc (gdk_utils.mx:1085)
==17410==    by 0x5DA73EE: GDKload (gdk_storage.mx:330)
==17410==    by 0x5B5550B: HEAPload (gdk_heap.mx:341)
==17410==    by 0x5DA9AE0: BATload_intern (gdk_storage.mx:643)
==17410==    by 0x5B4B849: BBPdescriptor (gdk_bbp.mx:1864)
==17410==    by 0x18C1F812: BATdescriptor (gdk.h:1324)
==17410==    by 0x18C1F786: temp_descriptor (bat_utils.mx:123)
==17410==    by 0x18C1EC76: tr_snapshot_bat (bat_storage.mx:1405)
==17410==    by 0x18C1EDB8: snapshot_table (bat_storage.mx:1434)
==17410==    by 0x18C11EE5: rollforward_update_table (store.mx:2289)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C123B3: rollforward_update_schema (store.mx:2372)
==17410==    by 0x18C10FB6: rollforward_changeset_updates (store.mx:1941)
==17410==    by 0x18C12551: rollforward_trans (store.mx:2394)
==17410==    by 0x18C135AE: sql_trans_commit (store.mx:2751)
==17410==    by 0x18BD693B: mvc_commit (sql_mvc.mx:443)
==17410==    by 0x18BAEC67: transactions (sql_trans.mx:65)
==17410==    by 0x18BBAC3E: semantic (sql_semantic.mx:1005)
>

----------------------------------------------------------------------

Comment By: Martin Kersten (mlkersten)
Date: 2009-02-16 19:34

Message:
Valgrind only reports about accessing non-initialized information here.
And this access to non-initialized buffer space is known and safe.

To detect and claim memory leaks in GDK you have to run
valgrind  --leak-check=full --show-reachable=yes --num-callers=20 mserver5
--dbfarm=$DBFARM --dbname=$DBNAME 

----------------------------------------------------------------------

Comment By: Fabian (mr-meltdown)
Date: 2009-02-16 19:31

Message:
I think someone else will still be in the neighbourhood of clueless with
your given instructions on how to reproduce your problem.

----------------------------------------------------------------------

Comment By: Stefan de Konink (skinkie)
Date: 2009-02-16 19:28

Message:
Who) Me
What) BenchmarkSQL on Java 1.5 fails
When) When running loadSQL code
Where) AMD64
Why) Because there is a memory problem in GDK
How) Running the loadSQL.sh program

I think all these questions were answered in the report. Because your
summery does not describe the section where the issue is present. Thus your
summary related to the actual knowledge that was gathered is incomplete.

----------------------------------------------------------------------

Comment By: Fabian (mr-meltdown)
Date: 2009-02-16 17:55

Message:
Please post enough information such that others can reproduce your problem.
 What makes you think this is a GDK issue?  Why not just "When batch
executing prepared statements, the server reports current transaction is
aborted after a random amount of statements with no previous error
message"?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2606020&group_id=56967

------------------------------------------------------------------------------
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs

Reply via email to