I was just reading that EXT3 uses an HTREE because it's more efficient than a BTREE; jdbm has an HTREE implementation.

Mark
Edson Tirelli wrote:
  Mark,

My feeling is that once someone uses an "ordering" constraint (>, >=, <, <=), the attribute will have many possible values, not a small discrete set. So the issue is not really the overhead of a tree vs. a hashmap, but whether the overhead of a tree pays off against the cost of not indexing such cases at all. My feeling is that it does pay off and is worth at least a try. I'm saying this because a hashmap+training-data approach (or hints, as suggested by Michael), which I think is what you are referring to, will not address the general case.
  I mean, when you have a rule like:

when
  A( $var : attr )
  B( attr2 > $var )
then

Using a hashmap will lead to a bucket-count explosion that may make the hashmap perform worse than the tree. Also, realize that in the above case, a naive approach would add a single B fact to multiple buckets, since it may match more than one A with different "attr" values. That's why I think a tree is better for the general case of "ordering" constraints; specific cases may be handled differently.
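To make the tradeoff concrete, here is a minimal sketch (not Drools internals, just an illustration with hypothetical names) of why a sorted tree index answers an `attr2 > $var` join naturally, while a hashmap only answers equality. Each B fact is stored exactly once under its `attr2` value, and the range is retrieved with a single `tailMap()` call:

```java
import java.util.*;

public class RangeIndexSketch {
    // Hypothetical fact store: maps an attr2 value to the B facts holding it.
    // A TreeMap keeps keys sorted, so "attr2 > $var" is one tailMap() call,
    // O(log n) to find the start of the range, instead of probing or
    // duplicating hash buckets.
    static final NavigableMap<Integer, List<String>> INDEX = new TreeMap<>();

    static void addFact(int attr2, String fact) {
        // Each fact is stored once, under its own key -- no duplication
        // across buckets as in the naive hashmap approach described above.
        INDEX.computeIfAbsent(attr2, k -> new ArrayList<>()).add(fact);
    }

    static List<String> factsGreaterThan(int var) {
        List<String> matches = new ArrayList<>();
        // tailMap(var, false): all entries whose key is strictly > var
        for (List<String> facts : INDEX.tailMap(var, false).values()) {
            matches.addAll(facts);
        }
        return matches;
    }

    public static void main(String[] args) {
        addFact(10, "b1");
        addFact(25, "b2");
        addFact(25, "b3");
        addFact(40, "b4");
        // $var = 20 from the matching A fact
        System.out.println(factsGreaterThan(20)); // prints [b2, b3, b4]
    }
}
```

The same structure also serves >=, <, and <= via `tailMap(var, true)`, `headMap(var, false)`, and `headMap(var, true)` respectively.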

I know htrees only by name. Any specific feature you think may be useful for us?

   []s
   Edson



Mark Proctor wrote:

I wonder if the overhead of a btree compared to a hashmap is worthwhile (by the way, have you seen htrees?). I expect it is, as join ordering can make a big difference.

Mark

Edson Tirelli wrote:

  Mark,

The approach I was looking into at that time was not feasible because we used to keep the order of asserted facts/tuples on a per-node basis. With the core changes you made in 3.1, we can implement range ordering by replacing the current hashmap index with a tree index. No training data is needed, in my understanding. Maybe we will need a composed-index approach for some cases, but the general solution idea is simple.
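A composed index could be sketched as a hashmap level for the equality part nested over a tree level for the ordering part. This is just an illustration under assumed names (`type`, `attr2`), not the actual Drools design, e.g. for a pattern like `B( type == $t, attr2 > $var )`:

```java
import java.util.*;

public class ComposedIndexSketch {
    // Hypothetical composed index: the outer HashMap resolves the equality
    // constraint in O(1); the inner TreeMap resolves the range constraint
    // with a single tailMap() call over sorted keys.
    static final Map<String, NavigableMap<Integer, List<String>>> INDEX = new HashMap<>();

    static void addFact(String type, int attr2, String fact) {
        INDEX.computeIfAbsent(type, t -> new TreeMap<>())
             .computeIfAbsent(attr2, k -> new ArrayList<>())
             .add(fact);
    }

    static List<String> lookup(String type, int var) {
        NavigableMap<Integer, List<String>> byAttr2 = INDEX.get(type);
        List<String> matches = new ArrayList<>();
        if (byAttr2 != null) {
            // all facts of the given type whose attr2 is strictly > var
            for (List<String> facts : byAttr2.tailMap(var, false).values()) {
                matches.addAll(facts);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        addFact("order", 10, "b1");
        addFact("order", 30, "b2");
        addFact("invoice", 50, "b3");
        System.out.println(lookup("order", 20)); // prints [b2]
    }
}
```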

  []s
  Edson

Mark Proctor wrote:

Actually I was just thinking about some stuff Edson has done. With solvers we know the available data and ranges, right? We can use this to order indexes, I know this was something Edson looked into - but without training data, we couldn't make it worth while - same for custom indexing. So we can start to incorporate those to get faster joins for known data sets.

Mark
Geoffrey De Smet wrote:

The more I learn from JCHS (or Prolog, for that matter),
the more I start to think that this is a different way of solving.

1) JCHS/Prolog looks like (or is) declarative solving.

2) Taseree is actually more hybrid, the general idea behind it is:
- Drools (declarative programming) is very easy for evaluation
but very difficult for solving.
- Local/tabu search (procedural programming) is easy for solving
but difficult for evaluation.


Both have their advantages and disadvantages; for example,
local search is generally faster but doesn't recognize the optimal solution.

To me it seems they are both interesting to implement,
and there must be some common ground too.
Should we hold a conference call about it this weekend?

It would be a good idea to compare JCHS and Taseree on a couple of problems, like the traveling tournament problem:
http://mat.gsia.cmu.edu/TOURN/



---------------------------------------------------------------------
To unsubscribe from this list please visit:

   http://xircles.codehaus.org/manage_email
