Hi,
My main concern is total space consumption in RAM. I don't care about
allocation efficiency, since building this tree is a one-time offline
task, and the tree won't change once it is built.
The tree structure I want to build is called an ADTree. It is a
multi-level hash for storing counts with space-saving implicit
representation of the most frequent attribute-value pair at each level
of the tree. The structure is very useful for data mining or machine
learning. I will be using it for a constraint programming problem I'm
working on.
See http://www.jair.org/abstracts/moore98a.html for a description.
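For readers unfamiliar with the structure, here is a rough sketch of the idea in C++. It is not the program discussed in this thread and not the exact layout from the paper: the names (ADNode, VaryNode, buildTree, countMatches) are made up for illustration, attribute values are assumed to be small integers, and the paper's leaf-list optimization is left out. Each node keeps a count; for every remaining attribute it keeps children only for the values that are not the most common one, and the count for the most common value is recovered by subtraction.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

using Row = std::vector<int>;                           // one record: a value per attribute
using Query = std::vector<std::pair<std::size_t, int>>; // (attribute, value), attributes increasing

struct ADNode;

// Splits the parent's records on one attribute. The most common value (MCV)
// gets no child; values with zero count get no child either.
struct VaryNode {
    VaryNode() = default;
    VaryNode(VaryNode&&) = default;            // move-only, so it can live in a std::vector
    VaryNode& operator=(VaryNode&&) = default;
    int mcv = -1;
    std::map<int, std::unique_ptr<ADNode>> child;
};

struct ADNode {
    std::size_t count = 0;       // number of records matching the path to this node
    std::size_t firstAttr = 0;   // vary[k] splits on attribute firstAttr + k
    std::vector<VaryNode> vary;
};

std::unique_ptr<ADNode> build(const std::vector<const Row*>& rows,
                              std::size_t firstAttr, std::size_t numAttrs)
{
    auto node = std::make_unique<ADNode>();
    node->count = rows.size();
    node->firstAttr = firstAttr;
    for (std::size_t a = firstAttr; a < numAttrs; ++a) {
        // Partition the records by their value for attribute a.
        std::map<int, std::vector<const Row*>> parts;
        for (const Row* r : rows) parts[(*r)[a]].push_back(r);
        VaryNode v;
        std::size_t best = 0;
        for (const auto& kv : parts)
            if (kv.second.size() > best) { best = kv.second.size(); v.mcv = kv.first; }
        // Build children for every value except the MCV.
        for (const auto& kv : parts)
            if (kv.first != v.mcv) v.child[kv.first] = build(kv.second, a + 1, numAttrs);
        node->vary.push_back(std::move(v));
    }
    return node;
}

std::unique_ptr<ADNode> buildTree(const std::vector<Row>& table, std::size_t numAttrs)
{
    std::vector<const Row*> rows;
    for (const Row& r : table) rows.push_back(&r);
    return build(rows, 0, numAttrs);
}

// Counts the records that satisfy q[pos], q[pos+1], ... below `node`.
std::size_t countMatches(const ADNode& node, const Query& q, std::size_t pos = 0)
{
    if (pos == q.size()) return node.count;
    const VaryNode& v = node.vary[q[pos].first - node.firstAttr];
    if (q[pos].second != v.mcv) {
        auto it = v.child.find(q[pos].second);
        return it == v.child.end() ? 0 : countMatches(*it->second, q, pos + 1);
    }
    // MCV case: everything matching the rest of the query,
    // minus the records that have some other value for this attribute.
    std::size_t total = countMatches(node, q, pos + 1);
    for (const auto& kv : v.child) total -= countMatches(*kv.second, q, pos + 1);
    return total;
}
```

Calling buildTree(table, numAttrs) once and then countMatches(*root, query) answers conjunctive count queries. The std::map and std::vector members are chosen here for brevity, not for compactness; a memory-tuned version would likely use flatter storage.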
I need to play with the structure to find a way to deal with high-arity
attributes. It would be much more pleasant to do this in Mozart, but I
want/need to keep my overall memory consumption under 2G for my
particular dataset. If an equivalent structure in Mozart consumed 150%
(or less) the space of the C++ version, it would be worth it to me to
recode it in Mozart. But I'd like to know before I invest the effort!
Thanks,
Irene
---------------------------------------------------
Date: Sun, 25 Sep 2005 21:32:11 +0200
From: Filip Konvicka <[EMAIL PROTECTED]>
Subject: Re: memory efficiency of records
To: [EMAIL PROTECTED]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi,
if your only concern is memory usage, then stick to C/C++. Mozart itself
is written in C++, so your problem can be solved in C++ at least as well
as in Mozart :-) If you use std C++ containers for data storage, take a
look at www.boost.org for a couple of useful template classes that help
with memory fragmentation (e.g. the Pool library). If you post (or send
me) your C++ program, I can perhaps give you some tips.
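To illustrate the pooling idea in general terms, here is a minimal sketch that uses only the standard library. It does not use or reproduce the Boost Pool API; NodeArena and CountNode are hypothetical names invented for this example. The arena hands out fixed-size slots from large blocks, so a build-once tree pays no per-node heap header and all nodes are released together when the arena goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical bump allocator for objects of a single type T.
// Objects are never freed individually; destroying the arena frees all blocks.
template <typename T>
class NodeArena {
    static_assert(std::is_trivially_destructible<T>::value,
                  "this sketch never runs destructors, so T must not need one");
public:
    explicit NodeArena(std::size_t nodesPerBlock = 4096)
        : perBlock_(nodesPerBlock ? nodesPerBlock : 1) {}
    NodeArena(const NodeArena&) = delete;
    NodeArena& operator=(const NodeArena&) = delete;

    // Constructs a T in the next free slot, starting a new block when needed.
    template <typename... Args>
    T* create(Args&&... args) {
        if (blocks_.empty() || usedInLast_ == perBlock_) {
            blocks_.push_back(std::make_unique<Slot[]>(perBlock_));
            usedInLast_ = 0;
        }
        void* where = &blocks_.back()[usedInLast_];
        T* obj = new (where) T(std::forward<Args>(args)...);
        ++usedInLast_;
        ++size_;
        return obj;
    }

    std::size_t size() const { return size_; }
    std::size_t blockCount() const { return blocks_.size(); }

private:
    // Raw storage with the size and alignment of one T.
    struct alignas(T) Slot { unsigned char bytes[sizeof(T)]; };

    std::size_t perBlock_;
    std::size_t usedInLast_ = 0;
    std::size_t size_ = 0;
    std::vector<std::unique_ptr<Slot[]>> blocks_;
};

// Example node for a count tree (an assumed layout, for illustration only).
struct CountNode {
    std::uint32_t count;
    CountNode* firstChild;
    CountNode* nextSibling;
};
```

A tree builder would call arena.create(CountNode{...}) wherever it would otherwise call new, and keep the arena alive for as long as the tree is in use.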
Cheers,
Filip
-------------------------------------
Date: Sat, 24 Sep 2005 19:30:48 -0600
From: Irene Langkilde-Geary <[EMAIL PROTECTED]>
Subject: memory efficiency of records
To: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="iso-8859-1"
How memory efficient are Mozart records? I'm using a C++ program that takes
a 250M data file as input and builds a tree structure in C++ that occupies
over 1G RAM when complete. Would it be even close to the same size if I
recoded the tree-builder in Mozart, or should I stick to C++?
Thanks,
Irene
_________________________________________________________________________________
mozart-users mailing list
[email protected]
http://www.mozart-oz.org/mailman/listinfo/mozart-users