On Thursday, September 30, 2004, at 12:35 PM, Jan Schenkel wrote:

If my memory serves me well, Geoff Canyon started a
thread on the xTalk mailing list a while ago that
proposed functions itemOffsets, wordOffsets and
lineOffsets which would return all the occurrences'
locations.

So if we could have an elementOffsets() function, this
would be the best solution for the above request, I
think.

Jan Schenkel.

I have been talking to people at the mothership about this.

First off, the software that I created with Rev is starting to sell. This gives me the money to pay for the external suggested below by Mark. The reason that I bring it up here on the list in the open is that the external will speed up my XML-based database, and if it were later added to the engine, it would speed up the software that I'm selling now. This sounds like it might be a great tool for array power if you are willing to use a parser for the manipulations.

Before I proceed, does this suggestion sound good for this array thread? (See below.) Pull-parsing an XML structure at high speed could give us all kinds of array manipulations if you were to use numbered tag sets like <1>[data]</1>, <2>[more data here]</2>, <3>[even more data]</3>, etc., and <1,1>, <1,2> and <1,3,1> for dimensional arrays.
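To make the numbered-tag idea concrete, here is a minimal sketch in Python (not Transcript). The function name parse_numbered_tags and the tuple keys are hypothetical, chosen only for illustration. It handles flat, non-nested tag sets only, and it uses a regular expression because tag names that start with a digit are not well-formed XML names, so a standard XML parser would be expected to reject them.

```python
import re

# Matches <1>...</1>, <2>...</2>, <1,3,1>...</1,3,1> and so on.
# The backreference \1 requires the closing tag to carry the same numbers.
TAG = re.compile(r"<(\d+(?:,\d+)*)>(.*?)</\1>", re.DOTALL)


def parse_numbered_tags(text):
    """Map each numbered tag to its content, keyed by a tuple of ints.

    Hypothetical helper: <1,3,1>x</1,3,1> becomes {(1, 3, 1): "x"}.
    Nested tag sets are not handled.
    """
    return {
        tuple(int(n) for n in m.group(1).split(",")): m.group(2)
        for m in TAG.finditer(text)
    }


cells = parse_numbered_tags("<1>red</1><2>green</2><1,3,1>blue</1,3,1>")
print(cells[(1,)])       # red
print(cells[(1, 3, 1)])  # blue
```

A dictionary keyed by tuples is only one possible stand-in for a "dimensional array"; the point is that each lookup reduces to finding an open tag and its matching close tag, which is the kind of offset search discussed below.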

Mark Brownell


On Wednesday, September 29, 2004, at 09:50 AM, Mark Waddingham wrote:

Hi Mark,

[snip]
As for your request for the suggested matchGlobal function [see below]:
while it would be nice to have, it is difficult to justify putting
development time into this as opposed to the other extensions,
enhancements and features that people have requested.


However, as I mentioned before, we would be perfectly willing to develop
an external with the functionality you require which can then be
integrated into the engine at the next opportunity. This both mitigates
the development cost to us, and provides you a more flexible solution
should you require specialization and/or optimization of the functions in
the future.


If you are interested in proceeding in this manner then I will happily put
together a more concrete proposal for you, including technical details
and time costings, and leave you to negotiate with Kevin the costs and
finer contractual details.


To give you an idea of the substance of such a proposal I would suggest
implementing an external with the following functions:

  matchOffsets(<needle>, <haystack>, [ <from> ], [ <to> ])
  - return a list of offsets of the <needle> in char <from> to <to> of
    <haystack>, one per line.

  matchParallelOffsets(<needles>, <needle_sep>, <haystack>, [ <from> ], [ <to> ])
  - return a list of offsets of each chunk of <needles> in char <from> to
    <to> of <haystack>.
    The chunks of <needles> would be delimited by the character <needle_sep>.
    Each line of this list would be of the form
      offset of <needle_1>, offset of <needle_2>, ..., offset of <needle_n>
    (i.e. the functionality of your parser would be given by doing a single
    call of matchParallelOffsets with two chunks in the <needles>)


  matchSetCacheSize <size>
  - The Boyer-Moore algorithm has a set-up cost for each pattern which
    incurs a memory overhead. This call would set the maximum number of
    patterns that should be cached at any one time.
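
A rough sketch of how the first two functions might behave, written in Python rather than Transcript and not taken from any actual external. Several details are assumptions, since the proposal leaves them open: offsets are 1-based and counted from the start of the haystack even when a from/to range is given, to is inclusive, overlapping matches are reported, and matchParallelOffsets searches for its needles one after another, starting a new row after the last needle is found. The Python names (match_offsets, match_parallel_offsets, format_rows) are illustrative only, and plain str.find stands in for Boyer-Moore.

```python
def match_offsets(needle, haystack, start=1, end=None):
    """Return the 1-based offsets of needle in char start to end of haystack."""
    if not needle:
        return []
    if end is None:
        end = len(haystack)
    offsets = []
    # str.find only reports matches lying wholly inside haystack[lo:end].
    pos = haystack.find(needle, max(start - 1, 0), end)
    while pos != -1:
        offsets.append(pos + 1)  # convert 0-based index to 1-based offset
        pos = haystack.find(needle, pos + 1, end)
    return offsets


def match_parallel_offsets(needles, needle_sep, haystack, start=1, end=None):
    """Return rows of 1-based offsets, one offset per needle in each row.

    Assumed semantics: each needle is searched for after the end of the
    previous match; the scan stops as soon as any needle is not found.
    """
    parts = [p for p in needles.split(needle_sep) if p]
    if not parts:
        return []
    if end is None:
        end = len(haystack)
    rows = []
    cursor = max(start - 1, 0)
    while True:
        row = []
        for part in parts:
            pos = haystack.find(part, cursor, end)
            if pos == -1:
                return rows  # incomplete rows are dropped
            row.append(pos + 1)
            cursor = pos + len(part)
        rows.append(row)


def format_rows(rows):
    """Render rows as text: one row per line, offsets separated by commas."""
    return "\n".join(",".join(str(n) for n in row) for row in rows)


print(match_offsets("<a>", "<a>foo</a><a>bar</a>"))
print(format_rows(match_parallel_offsets("<a>|</a>", "|", "<a>foo</a><a>bar</a>")))
```

With two needles, an open tag and its close tag, each output row brackets one element, which is the pull-parser case mentioned above.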

To give an idea of how these might be implemented in the engine,
Jeanne's suggestion for syntax is a good one (assuming it doesn't cause
any conflicts; I make no promises as to whether this syntax is feasible):


  the offsets of <needle> in <haystack>
  the offsets of the lines/words/items of <needles> in <haystack>

Anyway, I shall leave you to think on this way forward, and I promise to
be more efficient in getting back to you next time.


Warmest Regards,

Mark.

On Thu, 16 Sep 2004, Mark Brownell wrote:

Hi Mark,

I was wondering, now that things might have gotten a little less
hectic, what progress, if any, has been made on adding this to the Rev
engine? This is exactly what I was hoping to get. I can use it to
isolate large portions of huge documents for the purpose of creating
something I might need very badly in the next few months. Also, this
single function could be highly useful to others, as you pointed out.


Thanks,

Mark Brownell

On Wednesday, August 18, 2004, at 03:10 AM, Mark Waddingham wrote:

The one of most interest is the Boyer-Moore algorithm, as this is reputed
to be the fastest.

So, one idea is to implement a function:
  matchGlobal(stringToSearch, token)
returning a list of all indices in stringToSearch of token.

e.g.
  get matchGlobal("<a>foo</a><a>bar</a><a>baz</a>", "<a>")
would give
  it[1] = 1
  it[2] = 11
  it[3] = 21
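
For illustration, here is a minimal Python sketch of a matchGlobal-style search. It uses the Boyer-Moore-Horspool simplification (bad-character shifts only), not full Boyer-Moore, and it is not the engine's or any external's implementation. The names match_global and _shift_table are hypothetical. The lru_cache on the shift table is only a stand-in for the idea of caching per-pattern set-up work, and its size of 16 is arbitrary. Offsets are assumed to be 1-based, as with the built-in offset function.

```python
from functools import lru_cache


@lru_cache(maxsize=16)  # assumed cache size; stands in for a pattern cache
def _shift_table(token):
    """Horspool bad-character table: how far the window may slide when the
    text character under the token's last position is ch."""
    last = len(token) - 1
    # Later occurrences overwrite earlier ones, giving the smallest safe shift.
    return {ch: last - i for i, ch in enumerate(token[:-1])}


def match_global(string_to_search, token):
    """Return the 1-based offsets of every occurrence of token."""
    m, n = len(token), len(string_to_search)
    if m == 0 or m > n:
        return []
    shifts = _shift_table(token)
    hits = []
    i = 0
    while i <= n - m:
        if string_to_search[i:i + m] == token:
            hits.append(i + 1)
        # Slide by the shift for the character aligned with the token's end;
        # characters not in the table allow a full-length jump.
        i += shifts.get(string_to_search[i + m - 1], m)
    return hits


print(match_global("<a>foo</a><a>bar</a><a>baz</a>", "<a>"))  # [1, 11, 21]
```

Counting from 1, the three "<a>" tokens in the example string start at characters 1, 11 and 21, since each "<a>...</a>" element is ten characters long.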


_______________________________________________
use-revolution mailing list
http://lists.runrev.com/mailman/listinfo/use-revolution