Till,

On further thought, I think I have a better solution -- one that is more general and does not require any compiler support.

Add a method to the iterator of the form

int skip(int len);

The contract would be that skip tries to skip len items and returns len - n, where n is the number of items it actually skipped (n can be < len if it reached end-of-sequence). A return value of 0 therefore means that all len items were skipped.

The default implementation could be:

while(len-- > 0 && in.next() != null);
return len + 1;

The iterator of the Sequence could do this in a smarter way.
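To make the contract concrete, here is a rough Java sketch. The names ItemIterator, ArrayItemIterator and SkipSketch are made up for illustration and are not the engine's actual classes; the only assumption carried over from the snippet above is that next() returns null at end-of-sequence. The default skip is the loop above, and the list-backed iterator shows one way a materialized sequence might skip by moving an index instead of calling next() repeatedly.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical iterator interface; next() returns null at end-of-sequence. */
interface ItemIterator<T> {
    T next();

    /** Tries to skip len items; returns len - n, where n is the number actually skipped. */
    default int skip(int len) {
        while (len-- > 0 && next() != null) {
            // discard the item
        }
        return len + 1;
    }
}

/** Hypothetical iterator over a materialized (list-backed) sequence. */
final class ArrayItemIterator<T> implements ItemIterator<T> {
    private final List<T> items;
    private int pos = 0;

    ArrayItemIterator(List<T> items) {
        this.items = items;
    }

    @Override
    public T next() {
        return pos < items.size() ? items.get(pos++) : null;
    }

    /** Same contract as the default, but moves the index instead of iterating. */
    @Override
    public int skip(int len) {
        int n = Math.min(Math.max(len, 0), items.size() - pos);
        pos += n;
        return len - n;
    }
}

final class SkipSketch {
    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(10, 20, 30, 40, 50);
        Iterator<Integer> src = data.iterator();
        ItemIterator<Integer> plain = () -> src.hasNext() ? src.next() : null;
        ItemIterator<Integer> indexed = new ArrayItemIterator<>(data);

        System.out.println(plain.skip(3));   // 0: all three items skipped
        System.out.println(indexed.skip(3)); // 0
        System.out.println(plain.skip(4));   // 2: only two items were left
        System.out.println(indexed.skip(4)); // 2
    }
}
```

Both variants return the same values; the difference is only that the list-backed skip does not depend on len for its running time.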

The second part is the use of skip in subsequence.

This will give us the speed improvement for the query.
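As a sketch of that second part, subsequence could be written on top of skip roughly as follows. SkipIterator and SubsequenceSketch are hypothetical names, the interface is repeated only so the example stands alone, and the arguments are assumed to be plain integers (the rounding rules of fn:subsequence for doubles are left out).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical iterator contract: next() returns null at end-of-sequence. */
interface SkipIterator<T> {
    T next();

    /** Returns len - n, where n is the number of items actually skipped. */
    default int skip(int len) {
        while (len-- > 0 && next() != null) {
            // discard the item
        }
        return len + 1;
    }
}

final class SubsequenceSketch {
    /** Adapts a java.util.Iterator to the hypothetical contract. */
    static <T> SkipIterator<T> over(Iterator<T> src) {
        return () -> src.hasNext() ? src.next() : null;
    }

    /** Returns the items at positions p with start <= p < start + length (1-based). */
    static <T> List<T> subsequence(SkipIterator<T> in, int start, int length) {
        List<T> out = new ArrayList<>();
        if (start < 1) {
            // Positions below 1 do not exist, so shorten the window accordingly.
            length += start - 1;
            start = 1;
        }
        if (in.skip(start - 1) > 0) {
            return out; // the sequence ended before position start
        }
        T item;
        while (length-- > 0 && (item = in.next()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(subsequence(over(data.iterator()), 3, 2)); // [3, 4]
        System.out.println(subsequence(over(data.iterator()), 6, 5)); // [6, 7]
    }
}
```

With the default skip this is still a linear scan up to the window; the gain would come from iterators that override skip with something cheaper.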

What do you think?

Vinayak

Vinayak Borkar wrote:
In this specific case, the sequence is materialized anyway, since it's assigned to a global variable and then used in a loop. If we add a subsequence() method on XDMSequence, it can create the subsequence in constant time.

The compiler could replace the subsequence function with the subsequence-of-materialized-sequence function when it notices the materialization condition.
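A minimal sketch of what such a constant-time subsequence might look like, assuming the materialized items sit in a random-access list. MaterializedSequence is a hypothetical stand-in, not the engine's actual class; the idea is simply that a subsequence is a view (offset plus length) over the shared backing storage, so no items are copied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a materialized sequence. */
final class MaterializedSequence<T> {
    private final List<T> items; // backing storage, shared by all views
    private final int offset;    // index in items of this view's first item
    private final int length;    // number of items in this view

    MaterializedSequence(List<T> items) {
        this(items, 0, items.size());
    }

    private MaterializedSequence(List<T> items, int offset, int length) {
        this.items = items;
        this.offset = offset;
        this.length = length;
    }

    int size() {
        return length;
    }

    /** Returns the item at the given 1-based position. */
    T get(int position) {
        if (position < 1 || position > length) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        return items.get(offset + position - 1);
    }

    /** Returns a view of positions p with start <= p < start + len; nothing is copied. */
    MaterializedSequence<T> subsequence(int start, int len) {
        long from = Math.max(1L, start);
        long to = Math.min((long) length + 1, (long) start + len); // exclusive
        if (to <= from) {
            return new MaterializedSequence<>(items, offset, 0);
        }
        return new MaterializedSequence<>(items, offset + (int) (from - 1), (int) (to - from));
    }

    public static void main(String[] args) {
        MaterializedSequence<Integer> seq =
                new MaterializedSequence<>(new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7)));
        MaterializedSequence<Integer> window = seq.subsequence(3, 2);
        System.out.println(window.size()); // 2
        System.out.println(window.get(1)); // 3
    }
}
```

The constant-time claim only holds if the backing list offers constant-time indexed access, as ArrayList does.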

Vinayak

Till Westmann wrote:
How would you optimize this?
Materialize the sequence and index it?
Or evaluate multiple expressions on the sequence during one scan?
Or ...?

Till

On Jan 5, 2010, at 12:55 AM, Till Westmann wrote:

Hmm, is this test part of a later revision of the test suite? For me test execution never was a real problem ...

How slow is it?

Thanks,
Till

On Jan 4, 2010, at 9:50 PM, Vinayak Borkar wrote:

It's not an infinite loop -- Expressions/Construct/DirectConElem/DirectConElemContent//Constr-cont-document-3.xq

is taking a long time to run. I am pasting the query below.

The query iterates over about 1.1 M integers, 70 at a time, and calls subsequence() for the 70 items in each window. This is a quadratic operation in the engine and takes a long time.
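To put a rough number on "quadratic": the small Java sketch below counts how many items would be touched if every subsequence() call iterated from the first item of the sequence up to the end of its window. That scan-from-start behaviour is an assumption made for the estimate, and ScanCost is a made-up name; the ranges and the line width are taken from the query below.

```java
final class ScanCost {
    /** Number of integers in the query's $codepoints sequence. */
    static long codepointCount() {
        return 3                          // 9, 10, 13
                + (55295 - 32 + 1)        // 32 to 55295
                + (65532 - 57344 + 1)     // 57344 to 65532
                + (1114111 - 65536 + 1);  // 65536 to 1114111
    }

    /** Items touched if every subsequence() call iterates from the first item. */
    static long itemsVisited(long count, long lineWidth) {
        long windows = count / lineWidth; // $count idiv $lineWidth
        long total = 0;
        for (long i = 1; i <= windows; i++) {
            long startOffset = (i - 1) * lineWidth + 1;
            total += (startOffset - 1) + lineWidth; // pass over the prefix, then read the window
        }
        return total;
    }

    public static void main(String[] args) {
        long count = codepointCount();
        System.out.println("items:   " + count);                   // 1112032
        System.out.println("windows: " + count / 70);              // 15886
        System.out.println("visited: " + itemsVisited(count, 70)); // 8833330870
    }
}
```

Under that assumption the engine touches roughly 8.8 billion items to produce about 1.1 million, which matches the long running time.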

Should we optimize this case?


Thanks,
Vinayak


--- Query begin ---


declare variable $codepoints as xs:integer+ := (9, (: 0x9 :)
                                              10,(: 0xA :)
                                              13,(: 0xD :)
                                              32 to 55295, (: 0x20 - 0xD7FF :)
                                              57344 to 65532, (: 0xE000 - 0xFFFD :)
                                              65536 to 1114111 (: 0x10000 - 0x10FFFF :));
declare variable $count as xs:integer := count($codepoints);
declare variable $lineWidth as xs:integer := 70;

<allCodepoints>
  <!-- Each <r>-element represents a codepoint range. The 's' attribute
       is the start codepoint, the 'e' attribute is the end codepoint.
Note that these are only *Hints*, since the character range is not contiguous.
    -->
{
  "&#xA;",
  "&#xA;",
(: The outputted file is rather big, so to make it managable, we output
     a chunk of $lineWidth characters in each element.
   :)
  for $i in (1 to $count idiv $lineWidth)
  let $startOffset := (($i - 1) * $lineWidth) + 1
  return (<r s="{$codepoints[$startOffset]}"
             e="{$codepoints[$startOffset] + $lineWidth}">
              {
codepoints-to-string(subsequence($codepoints, $startOffset, $lineWidth))
              }
         </r>, "&#xA;")
}
</allCodepoints>

--- Query end ---

Vinayak Borkar wrote:
Till,
Does XTest on the test suite terminate? Some query is throwing the engine into an infinite loop. Do you see this?
Thanks,
Vinayak




