Hi,
I'm trying to use regex to search Unicode text in Rev. The docs
say that Rev follows PCRE rules, and I've tried various combinations of
\N and \x, but without success. Can someone offer a way to do this, or
confirm that Rev does not support regex for Unicode?
I am working with UTF-8 (Greek etc.).
Hi Ron,
If you have your Unicode data in a field, it is always in Rev's only
flavour of Unicode, which is mostly the same as UTF-16. So, if you
convert your search string to UTF-16 and escape the necessary
characters, you should be able to use regex on a field.
If you are doing a regex search
Ron,
I made a Japanese search field before. I don't know what the regex
is, but the search field works in Japanese and English on Rev3.0.
go stack url "http://www.kenjikojima.com/runrev/handbook/download/JpSearchFld.rev"
Try it,
--
Kenji Kojima
http://www.kenjikojima.com/
On 13 Mar
Hi Kenji, Mark,
Thanks for your suggestions.
Kenji, you are right about searching using the find command. I can
search for Kanji, as you do in your stack.
Mark, you are right about escaping the characters when I use regex.
I've been escaping the chars and that seems to work in the case
Ron,
When you do a search, you're not searching for text. You're searching
for *binary data*. So, find out what the binary data is and make a
regex for that. If there are any characters with a special meaning,
escape them.
Also bear in mind that when doing matchChunk, you'll get the
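To illustrate the "search the binary data" idea, here is a minimal sketch in Python rather than Rev script. It assumes the text is UTF-16 little-endian with no byte-order mark, which is an assumption about the data and not something confirmed in this thread; `find_utf16` is a hypothetical helper name. The search string is encoded to the same bytes, escaped, and matched against the raw bytes, and only hits that start on a 2-byte boundary are kept.

```python
import re

def find_utf16(haystack, needle):
    """Return 0-based character offsets of needle, found by matching raw UTF-16 bytes."""
    data = haystack.encode("utf-16-le")          # the binary data being searched (assumed UTF-16LE)
    raw = re.escape(needle.encode("utf-16-le"))  # escape bytes that are regex metacharacters
    # A lookahead reports every position where the needle starts, including overlapping ones.
    pattern = re.compile(b"(?=" + raw + b")")
    # Keep only hits on an even byte offset, i.e. on a 2-byte code unit boundary.
    return [m.start() // 2 for m in pattern.finditer(data) if m.start() % 2 == 0]

print(find_utf16("αβγ δε αβγ", "αβγ"))  # [0, 7]
print(find_utf16("axb a.b", "a.b"))      # [4]
```

The second call shows why escaping matters: without it the "." would also match the "x" in "axb". The offsets are halved because the regex reports byte positions, not character positions.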
Hi Ron,
Now I know what regex is. Japanese is written without spaces between
words. If your data is a kind of word list, you might use some parts
of this stack.
go stack url "http://www.kenjikojima.com/runrev/handbook/download/JpnSortStudy.rev"
see 美術用語.
--
Kenji Kojima
Thanks a million everyone! I'd never have found this stuff out without some
pointers, and these are exactly what I needed.
How do you do the following?
I have a series of lines which go like this
| [record separator, new record starts]
AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed
CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
| [record separator]
AAA
Peter Alcibiades wrote:
How do you do the following?
I have a series of lines which go like this
| [record separator, new record starts]
AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed
CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
|
I may be solving the wrong problem for you, but see if this works.
The prefix will always be word 1 of each line.
You do not need case sensitivity.
The delimiter is a tab.
- start copy
on test
put the clipboarddata into incomingList
filter incomingList without empty
repeat for each
Just thinking, and here is a bit more compact code to do the job:
It should run about the same speed.
repeat for each line LNN in incomingList
  put word 1 of LNN & cr after prefixList
end repeat
split prefixList using cr and tab
put the keys of prefixList into prefixList
could
--- Peter Alcibiades wrote:
How do you do the following?
I have a series of lines which go like this
| [record separator, new record starts]
AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed
CCC laboris nisi ut aliquip ex ea
I need to find the last matching character offset of a string within
another string. The offset function only gives me the first one. Is
there a function I'm missing (like offset from end of string) or a one
or two liner I can use?
Example:
approved_by_code - I'm looking for the offset to
Len, try this in the message box:
put "approved_by_code" into tFind
get matchText(tFind, "^(.*?)[^_]*$", p1)
put len(p1)
that outputs 12. And with a string that contains no _ it outputs 0
--gordy
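For comparison, the same "offset of the last match" idea can be sketched in Python. This is an illustration only, not Rev script: `last_offset` and `last_underscore_regex` are hypothetical names, and the second function reuses the lazy-prefix pattern from the message above.

```python
import re

def last_offset(text, needle):
    """1-based offset of the last occurrence of needle in text, or 0 if absent."""
    return text.rfind(needle) + 1  # rfind returns -1 when there is no match

def last_underscore_regex(text):
    """Same result for "_", using the pattern ^(.*?)[^_]*$ and the prefix length."""
    m = re.match(r"^(.*?)[^_]*$", text)
    # The lazy group grows only until the rest of the string has no underscore,
    # so it ends exactly at the last underscore (or stays empty if there is none).
    return len(m.group(1)) if m else 0

print(last_offset("approved_by_code", "_"))       # 12
print(last_underscore_regex("approved_by_code"))  # 12
print(last_underscore_regex("approved"))          # 0
```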
On Jul 10, 2007, at 09:55, Len Morgan wrote:
I need to find the last matching character offset of a
On Tue, 10 Jul 2007 09:55:18 -0500, Len Morgan wrote:
I need to find the last matching character offset of a string within
another string. The offset function only gives me the first one. Is
there a function I'm missing (like offset from end of string) or a
one or two liner I can use?
On Tue, 10 Jul 2007 10:20:10 -0500, Ken Ray wrote:
You'd get 9,9 in the message box - the first hit. So you need to
reverse the greediness of the match using the (?U) directive. This
one works to find the last occurrence:
on mouseUp
if matchChunk((?U)approved_by_code,(_),tStart,tEnd)
Well that brings up an interesting point. When I refer to a chunk
line x in my example, Revolution does not include the paragraph
delimiter, but in Devin's it does. How odd then that the
interpretation of what is meant by line is modified by how you
compare it with something else. Far be
On Aug 10, 2006, at 10:56 AM, Robert Sneidar wrote:
Well that brings up an interesting point. When I refer to a chunk
line x in my example, Revolution does not include the paragraph
delimiter, but in Devin's it does. How odd then that the
interpretation of what is meant by line is
Actually, after thinking about this, it is not ambiguous at all. In
both examples line x means the line without a cr. But in Devin's
example, he is comparing against each line in the container, whereas
I was simply checking for the existence of line x ANYWHERE in
the container. I
Hi,
I have a list like:
1
1
1
2
2
3
4
4
4
And I need all duplicate lines removed from this list so it becomes:
1
2
3
4
Does anyone have a suggestion? I am struggling through the regular expressions,
Not a regEx but works:
function killDuplicateLines tList
  put empty into prevL
  repeat for each line L in tList
    -- keep a line only when it differs from the one just before it
    if prevL is not L then put L & cr after newList
    put L into prevL
  end repeat
  return newList
end killDuplicateLines
Best,
Mark
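The approach above compares each line with the one immediately before it. A minimal Python sketch of the same technique, offered as an illustration and not as Rev script (`collapse_repeats` is a hypothetical name):

```python
from itertools import groupby

def collapse_repeats(text):
    """Drop every line that repeats the line immediately before it."""
    # groupby yields one key per run of equal adjacent lines.
    return "\n".join(key for key, _run in groupby(text.split("\n")))

print(collapse_repeats("1\n1\n1\n2\n2\n3\n4\n4\n4").split("\n"))  # ['1', '2', '3', '4']
print(collapse_repeats("a\nb\na").split("\n"))                    # ['a', 'b', 'a']
```

The second call shows that only adjacent repeats are removed: a value that comes back later in the list is kept.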
On 9 Aug 2006, at 14:36, Ton Kuypers
Hi Mark,
This works very well indeed, thanks.
I was just trying to use regex for speed reasons; it's very strange that I
can't get it to work...
Warm regards,
Ton Kuypers
Digital Media Partners bvba
Tel. +32 (0)477 / 739 530
Fax +32 (0)14 / 71 03 04
http://www.dmp-int.com
On 9-aug-06, at
On 8/9/06 6:36 AM, Ton Kuypers wrote:
This will do the same thing.
The idea is to use Rev's array features such that keys are automatically
unique and can be as long as you wish.
This means that you could remove duplicate lines of any length.
get listOfAnything
filter it
Ton Kuypers wrote:
I was just trying to use the regEx for speed reasons...
RegEx is highly generalized, to the point that in each case I've
benchmarked here I was able to come up with a faster solution using
chunk expressions.
--
Richard Gaskin
Fourth World Media Corporation
This method is faster - but it doesn't do exactly the same thing. If
the idea is simply to have unique values in each line, then this is
the way to go. The method I suggested simply removes repeating lines,
and would be more suitable if one were trying to record changes in a
stream of
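The difference between the two methods shows up when a value reappears later in the list. As a rough Python illustration of the "keys are unique" idea (not Rev script; `unique_lines` is a hypothetical name), a dict can stand in for the Rev array. Note that a Python dict keeps insertion order, which may not hold for the keys of a Rev array.

```python
def unique_lines(text):
    """Keep the first occurrence of each line, using dict keys for uniqueness."""
    # dict.fromkeys stores each distinct line once, in first-seen order.
    return "\n".join(dict.fromkeys(text.split("\n")))

print(unique_lines("1\n1\n2\n1").split("\n"))  # ['1', '2']
```

Removing only repeating lines would leave the final "1" in place here, which is the behaviour one would want when recording changes rather than collecting unique values.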
put yourdatahere into moldlist
put empty into mnewlist
repeat for each line theLine in moldlist
  -- theLine already holds the text of the current line
  if theLine is in mnewlist then
    next repeat
  else
    put theLine & return after mnewlist
  end if
end repeat
Bob Sneidar
IT Manager
Logos Management
Calvary Chapel CM
On Aug 9, 2006, at 9:25
Having been caught by things like this before, I would suggest a
small modification to Bob's script:
On Aug 9, 2006, at 12:21 PM, Robert Sneidar wrote:
put yourdatahere into moldlist
put empty into mnewlist
repeat for each line theLine in moldlist
  if theLine is among the lines of mnewlist
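The reason for the change is that "is in" is a substring test, while "is among the lines of" requires a whole line to match. A short Python sketch of the same pitfall, as an illustration only (both function names are hypothetical):

```python
def seen_by_substring(line, collected):
    """Analogue of 'is in': true if line occurs anywhere inside the collected text."""
    return line in collected

def seen_by_whole_line(line, collected):
    """Analogue of 'is among the lines of': true only if an entire line equals it."""
    return line in collected.split("\n")

collected = "10\n20\n"
print(seen_by_substring("1", collected))   # True: "1" is part of "10"
print(seen_by_whole_line("1", collected))  # False: no line is exactly "1"
```

With the substring test, a new line "1" would be wrongly skipped as a duplicate once "10" is in the list.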