Bugs item #1811229, was opened at 2007-10-11 02:36
Message generated for change (Comment added) made by boncz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1811229&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: XML
Group: XML
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Stefan de Konink (skinkie)
Assigned to: Peter Boncz (boncz)
Summary: [ADT] Adding large document, with update support

Initial Comment:
Inserting a fairly large document (20GB) with 5% slack space results in a 
partial insert (without errors). Any query on this document then fails 
with the following error:

> xquery>
> more> fn:count(doc("planet.osm")//*)
> more>MAPI  = mone...@localhost:50000
> QUERY =  fn:count(doc("planet.osm")//*)
> ERROR = !ERROR: [remap]: 5 times inserted nil due to errors at tuples
> 1...@0, 2...@0, 3...@0, 4...@0, 5...@0.
>         !ERROR: [remap]: first error was:
>         !ERROR: CMDremap: operation failed.
>         !ERROR: interpret_unpin: [remap] bat=492,stamp=-729 OVERWRITTEN
>         !ERROR: BBPdecref: 1000000001_rid_nid does not have pointer fixes.
>         !ERROR: interpret_params: leftfetchjoin(param 2): evaluation error.

Inserting the same document in read only mode works. Enough disk space is 
available. I'm using the -SR3 release.


The document itself can be obtained from an OpenStreetMap mirror.

----------------------------------------------------------------------

>Comment By: Peter Boncz (boncz)
Date: 2009-04-07 16:29

Message:
performance will not be fixed further; the bug was fixed already


----------------------------------------------------------------------

Comment By: Peter Boncz (boncz)
Date: 2007-10-21 22:12

Message:
Logged In: YES 
user_id=591107
Originator: NO

Hi Stefan, et al,

summary: the new release of XQuery (0.20) will fix the remap failing
problem. However, until further fixes are made, performance problems on
very large datasets will persist. Work on this is ongoing.

The remap failing was just a bug.

The simple diagnosis why it remains slow is that with 8GB of memory and a
20GB document, there is too little memory to keep everything in RAM. My rule
of thumb is that the file-size of all documents you are querying
simultaneously should not be larger than your RAM size. You are violating
that rule.
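
As a rough illustration of that rule of thumb, the check below compares the
combined document size against the RAM size. It is only a sketch: the helper
name within_ram_rule is made up for this example and is not part of MonetDB,
and the 32GB machine is a hypothetical comparison point.

```python
GIB = 1024 ** 3

def within_ram_rule(doc_sizes_bytes, ram_bytes):
    """True when the combined size of all simultaneously queried
    documents does not exceed the RAM size (the rule of thumb above)."""
    return sum(doc_sizes_bytes) <= ram_bytes

# The case in this report: one 20GB document on an 8GB machine.
print(within_ram_rule([20 * GIB], 8 * GIB))

# The same document on a hypothetical 32GB machine.
print(within_ram_rule([20 * GIB], 32 * GIB))
```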

What is worse, on updatable documents, the index is created at "first use"
and is *not* persistent, hence each time the database is restarted, the
first query will take long, because it constructs the index. As you
observed, constructing the index may take very long (>6 hours). It may help
to use a 64-bit OS with a 32bit-oid binary. I have looked at your data size
and "it fits" within the constraints of 32bit-oids, which means that
compared to 64bit-oids you save 50% of the memory used for them (but of
course, not nearly all data is oid-data). Given your extreme shortage of
memory, saving space this way is a good idea after all.
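
To make the 50% figure concrete, here is a small back-of-the-envelope
calculation using row counts of a few oid-typed columns taken from the ls()
listing quoted further down in this thread. It assumes an oid takes 8 bytes in
a 64bit-oid build and 4 bytes in a 32bit-oid build, and it assumes "fits"
means every count stays below 2^31; both are assumptions made for this sketch,
not statements about the exact MonetDB limits.

```python
# Row counts of some oid-typed tail columns of one document,
# copied from the ls() listing in this thread.
oid_columns = {
    "1000000002_attr_own": 754314985,
    "1000000002_attr_prop": 754314985,
    "1000000002_attr_qn": 754314985,
    "1000000002_rid_prop": 707424150,
}

def oid_bytes(counts, width_bytes):
    """Total bytes needed to store all oid values at the given width."""
    return sum(counts.values()) * width_bytes

wide = oid_bytes(oid_columns, 8)     # assumed size with 64bit oids
narrow = oid_bytes(oid_columns, 4)   # assumed size with 32bit oids

# Assumed limit for 32bit oids in this sketch: counts below 2**31.
fits = max(oid_columns.values()) < 2 ** 31

print(f"64bit oids: {wide / 2**30:.1f} GiB")
print(f"32bit oids: {narrow / 2**30:.1f} GiB")
print(f"fits in 32bit oids: {fits}")
```

The saving applies only to oid-typed data; str, chr and int columns are not
affected, which is why the overall gain is well below 50%.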

I have looked into the question why under memory shortage index
construction is so slow (>6 hours) and why also with the index available,
the query remains slow (multiple minutes). This resulted in a patch that I
attach to this bug report, that you may test. The main thing is that I
switched back to ordered indices from an unordered index with a hash table. Under
swapping, the hash table keeps getting lost, and also with swapping, sorted
selections tend to be faster than (random access) hash selections.
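
The difference between the two access patterns can be sketched as follows.
This is not the code from pathfinder.mil; it is a simplified Python
illustration with made-up function names, where an index maps integer value
hashes to node ids. The ordered variant finds a key by binary search and then
reads neighbouring entries, while the hash variant jumps to an arbitrary
bucket for every lookup.

```python
import bisect

def build_sorted_index(values):
    """Return (value, nid) pairs sorted by value; nid is the position in values."""
    return sorted((v, nid) for nid, v in enumerate(values))

def select_sorted(index, key):
    """Binary search for the first match, then scan the contiguous run of matches."""
    pos = bisect.bisect_left(index, (key, -1))
    nids = []
    while pos < len(index) and index[pos][0] == key:
        nids.append(index[pos][1])
        pos += 1
    return nids

def build_hash_index(values):
    """Return a dict from value to the list of nids holding that value."""
    index = {}
    for nid, v in enumerate(values):
        index.setdefault(v, []).append(nid)
    return index

def select_hash(index, key):
    """Look the key up directly; each lookup is a random access."""
    return index.get(key, [])

values = [7, 3, 7, 1]
print(select_sorted(build_sorted_index(values), 7))
print(select_hash(build_hash_index(values), 7))
```

Both variants return the same node ids; what differs is the memory access
pattern, which is what matters once the data no longer fits in RAM.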

The patch is applied by substituting the pathfinder.mil file in your
installation by the attached file (decompress it first). NOTE: this
pathfinder.mil *only* matches 0.20 installations, thus the new release!!

On a machine with 15GB memory, I can now create the index in 1 hour.
Shredding takes 25 minutes. Queries take 15 seconds. Whether it will work on
your 8GB machine is questionable, but you may give it a try.

As a general note, your problem triggered me to review the behavior of the
(just added) value indices under memory pressure. The situation that a
first query will necessarily take 1.5 hours is certainly undesirable. Sure,
on data sizes smaller than 20GB, it is faster, but still sluggish. I am
now considering making the indices persistent for updatable documents as
well (they already are for readonly ones), but doing so implies that
changes to the index in its so-called delta structures will need to be
logged during a commit, which is not yet implemented.
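
What logging the delta at commit time could look like is sketched below. This
is purely hypothetical: the class DeltaIndex, its methods and the JSON log
format are invented for illustration and do not describe the actual MonetDB
delta structures or its write-ahead log.

```python
import json

class DeltaIndex:
    """Index with a persistent base part and an in-memory delta of changes."""

    def __init__(self, base=None):
        # base: value -> collection of nids (the persistent part)
        self.base = {v: set(nids) for v, nids in (base or {}).items()}
        self.delta = []  # pending ("ins" | "del", value, nid) operations

    def insert(self, value, nid):
        self.delta.append(("ins", value, nid))

    def delete(self, value, nid):
        self.delta.append(("del", value, nid))

    @staticmethod
    def _apply(target, ops):
        """Apply operations in order to a value -> set-of-nids mapping."""
        for op, value, nid in ops:
            if op == "ins":
                target.setdefault(value, set()).add(nid)
            else:
                target.get(value, set()).discard(nid)

    def lookup(self, value):
        """Return the nids for value as seen through base plus pending delta."""
        view = {value: set(self.base.get(value, ()))}
        self._apply(view, [op for op in self.delta if op[1] == value])
        return sorted(view[value])

    def commit(self, log):
        """Append the delta to the log first, then fold it into the base."""
        log.append(json.dumps(self.delta))
        self._apply(self.base, self.delta)
        self.delta = []

log = []
idx = DeltaIndex({7: [0, 2]})
idx.insert(7, 5)
idx.delete(7, 0)
idx.commit(log)
print(idx.lookup(7), len(log))
```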

This is the main reason why the attached patch is not yet final: I will
probably switch to persistent indices if I find time to hack it
in... to be continued
File Added: pathfinder.zip

----------------------------------------------------------------------

Comment By: Martin Kersten (mlkersten)
Date: 2007-10-21 20:17

Message:
Logged In: YES 
user_id=490798
Originator: NO

Please include the comments made by SM and your assessment,
planning etc.

----------------------------------------------------------------------

Comment By: Peter Boncz (boncz)
Date: 2007-10-15 12:10

Message:
Logged In: YES 
user_id=591107
Originator: NO

On request, Stefan provided the following detailed info about this
problem:

PLATFORM INFO
=============

Monet Database Server V4.18.2
Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs;
Gentoo, Linux, xen01 2.6.20-xen-r3 #3 SMP Sun Oct 7 05:22:20 CEST 2007
x86_64 Intel(R) Xeon(R) CPU L5320 @ 1.86GHz GenuineIntel GNU/Linux


REPRODUCTION
============

- Download http://mirror.openstreetmap.nl/planet/planet-071003.osm.bz2
- Decompress the document.
- Start the MonetDB4 server with the Pathfinder module.
- Add this document to the database server with 5 percent space for
  updates.
- Any operation requested on this document results in the failure
  reported.

This error doesn't happen if the document was added without the ability
to update the document.

Also, here is my current ls() output:

MonetDB>ls();
#------------------------------------------------------------------------------------------------------------------------#
# name                            htype   ttype           count      heat  dirty    status  kind    refcnt  lrefcnt # name
# str                             str     str             lng        int   str      str     str     int     int     # type
#------------------------------------------------------------------------------------------------------------------------#
[ "1000000000_attr_own",          "void", "oid",          747750437, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_attr_prop",         "void", "oid",          747750437, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_attr_qn",           "void", "oid",          747750437, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_frag_root",         "oid",  "oid",          1,         0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000000_map_pid",           "void", "void",         42822,     0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_nid_rid",           "void", "void",         701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_prop_com",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_prop_ins",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_prop_text",         "void", "str",          3,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_prop_tgt",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_prop_val",          "void", "str",          118077679, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_histogram",      "void", "lng",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_loc",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_nid",            "oid",  "oid",          271876931, 0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000000_qn_prefix",         "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_prefix_uri_loc", "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_uri",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_qn_uri_loc",        "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_rid_kind",          "void", "chr",          701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_rid_level",         "void", "chr",          701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_rid_nid",           "void", "void",         701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_rid_prop",          "void", "oid",          701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_rid_size",          "void", "int",          701587412, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000000_vx_hsh_nid",        "int",  "oid",          402481184, 0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000001_attr_own",          "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_attr_prop",         "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_attr_qn",           "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_frag_root",         "oid",  "oid",          1,         0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000001_map_pid",           "void", "oid",          45450,     0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_nid_rid",           "void", "oid",          707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_prop_com",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_prop_ins",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_prop_text",         "void", "str",          3,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_prop_tgt",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_prop_val",          "void", "str",          119784146, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_histogram",      "void", "lng",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_loc",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_prefix",         "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_prefix_uri_loc", "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_uri",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_qn_uri_loc",        "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_rid_kind",          "void", "chr",          744652800, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_rid_level",         "void", "chr",          744652800, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_rid_nid",           "void", "oid",          744652800, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_rid_prop",          "void", "oid",          744652800, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000001_rid_size",          "void", "int",          744652800, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_attr_own",          "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_attr_prop",         "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_attr_qn",           "void", "oid",          754314985, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_frag_root",         "oid",  "oid",          1,         0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000002_map_pid",           "void", "void",         43178,     0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_nid_rid",           "void", "void",         707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_prop_com",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_prop_ins",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_prop_text",         "void", "str",          3,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_prop_tgt",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_prop_val",          "void", "str",          119784146, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_histogram",      "void", "lng",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_loc",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_nid",            "oid",  "oid",          273809205, 0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000002_qn_prefix",         "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_prefix_uri_loc", "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_uri",            "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_qn_uri_loc",        "void", "str",          19,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_rid_kind",          "void", "chr",          707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_rid_level",         "void", "chr",          707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_rid_nid",           "void", "void",         707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_rid_prop",          "void", "oid",          707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_rid_size",          "void", "int",          707424150, 0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000002_vx_hsh_nid",        "int",  "oid",          407415839, 0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000003_attr_own",          "void", "oid",          5870236,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_attr_prop",         "void", "oid",          5870236,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_attr_qn",           "void", "oid",          5870236,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_frag_root",         "oid",  "oid",          1,         0,    "clean", "disk", "pers", 0,      1       ]
[ "1000000003_map_pid",           "void", "oid",          351,       0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_nid_rid",           "void", "oid",          5455263,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_prop_com",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_prop_ins",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_prop_text",         "void", "str",          3,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_prop_tgt",          "void", "str",          0,         0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_prop_val",          "void", "str",          1814631,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_histogram",      "void", "lng",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_loc",            "void", "str",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_prefix",         "void", "str",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_prefix_uri_loc", "void", "str",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_uri",            "void", "str",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_qn_uri_loc",        "void", "str",          16,        0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_rid_kind",          "void", "chr",          5750784,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_rid_level",         "void", "chr",          5750784,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_rid_nid",           "void", "oid",          5750784,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_rid_prop",          "void", "oid",          5750784,   0,    "clean", "disk", "pers", 0,      2       ]
[ "1000000003_rid_size",          "void", "int",          5750784,   0,    "clean", "disk", "pers", 0,      2       ]
[ "collection_name",              "oid",  "str",          4,         0,    "clean", "load", "pers", 0,      2       ]
[ "collection_size",              "oid",  "lng",          4,         0,    "clean", "disk", "pers", 0,      2       ]
[ "doc_collection",               "oid",  "oid",          6,         0,    "clean", "load", "pers", 0,      2       ]
[ "doc_location",                 "oid",  "str",          4,         0,    "clean", "disk", "pers", 0,      2       ]
[ "doc_name",                     "oid",  "str",          4,         0,    "clean", "load", "pers", 0,      2       ]
[ "doc_timestamp",                "oid",  "timestamp",    4,         0,    "clean", "disk", "pers", 0,      2       ]
[ "uri_lifetime",                 "str",  "lng",          1,         0,    "clean", "load", "pers", 0,      2       ]
[ "xquery_catalog",               "int",  "str",          86,        0,    "clean", "load", "pers", 1,      1       ]
[ "xquery_seqs",                  "int",  "lng",          1,         0,    "clean", "load", "pers", 1,      2       ]
[ "xquery_snapshots",             "int",  "int",          0,         0,    "clean", "load", "pers", 1,      2       ]



----------------------------------------------------------------------
