New topic: 

Shunting Yard Algorithm

<http://forums.realsoftware.com/viewtopic.php?t=47238>

         Page 1 of 2
   [ 25 posts ]   Go to page 1, 2  Next                Previous topic | Next 
topic          Author  Message        DaveS          Post subject: Shunting 
Yard AlgorithmPosted: Sun Mar 10, 2013 12:51 am                                 
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                Does anyone have a RealStudio 
implemetaion of the "Switching Yard" Algorithm..... I need a version that can 
handle FUNCTIONS as well as simple add/sub/mult/div....  Also variable names... 
simple as well as arrays... the arrays can be designated as functions to 
simplfy things.

Basically I need to be able to turn a text string such as 

s="SIN(X)+2*(X/(3+7)"
into an array of tokens arranged in a RPN order so that they can then be used 
to "solve" the equation.      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind      

    Last edited by DaveS on Sun Mar 10, 2013 9:21 am, edited 1 time in total.   
                             Top                ktekinay          Post subject: 
Re: Switching Yard AlgorithmPosted: Sun Mar 10, 2013 1:13 am                    
             
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                I have no idea what the "switching yard 
algorithm" is, but I do have an Evaluate class that uses RBScript to safely 
evaluate a string expression like the one you posted in your example. Would 
that help you?      
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top                doofus          Post subject: 
Re: Switching Yard AlgorithmPosted: Sun Mar 10, 2013 2:15 am                    
             
Joined: Thu Sep 10, 2009 2:50 am
Posts: 374
Location: Santa Cruz, CA, USA                Charcoal design has a good 
tokenizer, it's event based but easy to add returning a token array
http://www.charcoaldesign.co.uk/pages/source.realbasic.php

And wikipedia has a good outline for processing those tokens
http://en.wikipedia.org/wiki/Shunting-yard_algorithm   
                             Top                DaveS          Post subject: 
Re: Switching Yard AlgorithmPosted: Sun Mar 10, 2013 8:56 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                Sorry... SHUNTING YARD 

I have found dozens of articles.... but all of them evaluate a string 
consisting solely of numbers, parends and simple operators (+/-*) .... 

I need to include single operator functions (SIN(x)), as well as double 
operator functions (LEFT(s,x)),  and variables (NOT REALSTUDIO variables, but 
to be looked up in a dictionary).   So the solution cannot rely on the lexical 
functions that RealStudio uses internally for their own compiler.

I expect (and require) that I will have dozens of functions that will perform 
the actual mathematical calculations.  All I need is a way to convert the 
infix/prefix syntax to a RPN stack based one.

Simple example.  2+3 becomes [+] [3] [2]  so when the values are popped off the 
stack.. you get two numbers, then the operator [+] which indicates to add the 
two numbers popped off and push the result leaving [5] on the stack.  Needless 
to say UNARY negative numbers need to be dealt with and not confused with the 
subtraction of two values.      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                DaveS          Post subject: 
Re: Switching Yard AlgorithmPosted: Sun Mar 10, 2013 9:07 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                doofus wrote:Charcoal design has a good 
tokenizer, it's event based but easy to add returning a token array
http://www.charcoaldesign.co.uk/pages/source.realbasic.php

Looked good.... .  until it was unable to even handle a simple  equation like 
3*5  

thanks      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                ktekinay          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 9:41 am                     
            
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                As for the variables, why not pre-process 
the expression and replace the variables with their values before tokenizing? I 
think a regular expression like "\b<variablename>\b" in a loop would do the job 
nicely.

So, off the top of my head:
dim rx as new RegEx
rx.Options.ReplaceAllMatches = true
dim variableNames() as variant = varDict.Keys
for each n as variant in variableNames
  // Assume variable names cannot contain "\"
  // Otherwise:
  // n = n.ReplaceAll( "\", "\\" )
  rx.SearchPattern = "\b" + n + "\b"
  rx.ReplacementPattern = varDict.Value( n )
  expression = rx.Replace( expression )
next

Assuming your dictionary is complete, where a = 3, this would turn an 
expression like "a+4" into "3+4", and then your tokenizer could assume that any 
letters it encounters are part of function names. (And a regular expression 
could help identify those too.)      
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top                DaveS          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 9:52 am                     
            
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                I already have progressed to the point 
where I have a STRING array with each element consisting of a "token" from the 
supplied expression. and have in fact done what you mentioned (just not the 
"way" you suggested)... and have replaced the tokens for SCALAR variables with 
their current values.... So where A=3, the expression  a+4 becomes 3+4

What I am struggling with is ARRAYS,  because their arguments could be 
expressions...... and its just a matter of identifiying which tokens are those 
arguments and calling the parser recursivly...  Which is where I was 
considering, convserion to RPN stack, and evaluating THOSE values when they 
came to the top of the stack.

x=A(z,9+q)/(3*sin(q))

I also have a unique sitiuation where I can have multiple variables that in 
fact have the same root name

C,  C(3),  C(3,3)  names can be the same as long as the signatures are 
different [kind like overloading]

This is a level of flexiablity that I must maintain to be compatible with the 
source that I am simulating.      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                ktekinay          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 10:39 am                    
             
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                I can see the complexity that arrays can 
add, especially if they can have the same root name as a scalar. May we can 
make a number of assumptions, both functional and convenient?

Spaces and other white space can be removed from the expression. Variable and 
function names can only be combinations of letters and numbers.Case doesn't 
matter.Arrays can only have a limited number of indexes. (a(x) or a(x,y), 
possibly a(x,y,z), but not a(x,y,z,zz))You maintain a separate list of 
functions.You maintain separate lists of arrays and scalar variables.You 
maintain separate list of arrays based on the number of indexes.Groups enclosed 
by parens can be evaluated independently as if they were the entire expression. 
The exception to the above is where the parens are enclosing the indexes of a 
multi-dimensional array. In that case, the values between the commas can be 
evaluated as if they were the entire expression.An array variable name is 
always immediately followed by an open paren. Conversely, a scalar will be 
followed by anything but an open paren.You don't actually need the RPN version 
of the expression, you just want to evaluate it properly and thought that would 
be a good way to proceed.
If these are all true, I think it makes the problem much easier. Have I gone 
wrong somewhere?      
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top                DaveS          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 10:41 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                ktekinay wrote:I can see the complexity 
that arrays can add, especially if they can have the same root name as a 
scalar. May we can make a number of assumptions, both functional and convenient?

Spaces and other white space can be removed from the expression. Variable and 
function names can only be combinations of letters and numbers.Case doesn't 
matter.Arrays can only have a limited number of indexes. (a(x) or a(x,y), 
possibly a(x,y,z), but not a(x,y,z,zz))You maintain a separate list of 
functions.You maintain separate lists of arrays and scalar variables.You 
maintain separate list of arrays based on the number of indexes.Groups enclosed 
by parens can be evaluated independently as if they were the entire expression. 
The exception to the above is where the parens are enclosing the indexes of a 
multi-dimensional array. In that case, the values between the commas can be 
evaluated as if they were the entire expression.An array variable name is 
always immediately followed by an open paren. Conversely, a scalar will be 
followed by anything but an open paren.You don't actually need the RPN version 
of the expression, you just want to evaluate it properly and thought that would 
be a good way to proceed.
If these are all true, I think it makes the problem much easier. Have I gone 
wrong somewhere?


ALL of those are TRUE statements. with a caveat on the first on.  Spaces NOT 
enclosed in DOUBLE QUOTES (ie a String)      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                DaveS          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 10:51 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                If it helps 
My "variables" are stored in a dynamic array (might change later to colleciton 
or dictionary) as custom classes

Constructor :  Name as String,  ArraySizeX as integer, ArraySizeY as integer  
(both default to 0)
if either Size argument is >0 then it is a ONE or TWO dimensional ARRAY

GET - Return Value of variable (this is overloaded to handle SCALAR, ONE, TWO 
dim'd arrays)

isARRAY returns a boolean if variable is an array or not

DataType returns an indicator that it is INTEGER, SINGLE, DOUBLE or STRING

Name returns uh.... the name 

num_dim returns # of dimensions  0, 1 or 2

Put  to store a value [overloaded just like GET]      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                ktekinay          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 11:26 am                    
             
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                Man, you read fast. 

I still think replacing the scalars is a good way to start, and you can use my 
code above, just with this SearchPattern instead:
rx.SearchPattern = "\b" + n + "\b(?!\))"
Where the variable is "a", this would replace "a+b" but not "a(x)+b".

Looking at your next post, the only modification is if the variable represents 
a string, in which case you'd replace it with the value surrounded by quotes.

The one complication, of course, is when you have quote-enclosed strings. The 
algorithm has to properly replace "a+b" with "3+4", and "a+""a+b""" with 
"""some string""+""a+b""".

I'm thinking of something that would keep drilling down into the paren-enclosed 
expressions and replacing variables with the values they represent as constants 
as needed. Generally, the steps I'm thinking of are:

Swap the scalars for their values.Start outer loop.Start inner loop.Swap 
double-index arrays for their values where indexes are values.Swap single-index 
arrays for their values where indexes are values.End inner loop when no more 
changes can be made.Swap functions with their results where argument is a 
constant (value or quote-enclosed string).Swap expressions with their values 
where each side of the expression is a constant. (This one needs to be fleshed 
out more.)Drill down into each paren-enclosed expression and recursively repeat 
the above starting at step 2.End outer loop when no more changes can be made.
Have I missed something? If not, I'm thinking of something that might do the 
trick, but need some time to flesh it out in my head.      
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top                DaveS          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 11:30 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                That looks good .... I have step #1 
already accomplished...

and at this point I have an ARRAY with each "token" from the equation in it..

So something like this "SIN(x)*3+A+Q(3)" has become

SIN
(
19 --- assuming this is value of X
)
*
3
+
42  --- assuming this is value of A
+
Q  -- not replaced (yet) because it is an ARRAY
(
3
)


And no worries about double double quoted strings or strings with escape 
characters....  strings will be simple 

s="This is a String"      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                ktekinay          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 11:46 am                    
             
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                Does your current algorithm will also 
properly break down Left("xy",1) into:

Left
(
"
x
y
"
,
1
)

Or (even better)

Left
(
"
xy
"
,
1
)      
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top                DaveS          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 11:55 am                    
             
Joined: Sun Aug 05, 2007 10:46 am
Posts: 4651
Location: San Diego, CA                no.. is shows

LEFT
(
"xy"
,
1
)

it recognizes quoted strings as a single token....  and had "xy" been a string 
variable name it would have inserted a the QUOTED value of that string      
_________________
Dave Sisemore
MacPro, OSX Lion 10.7.4 RB2012r1
Note : I am not  interested in any solutions that involve custom Plug-ins of 
any kind  
                             Top                ktekinay          Post subject: 
Re: Shunting Yard AlgorithmPosted: Sun Mar 10, 2013 12:10 pm                    
             
Joined: Mon Feb 05, 2007 5:21 pm
Posts: 520
Location: New York, NY                Awesome.

It seems like the job just got much easier. Each token can be classified as a 
value, operator, or label-with-parens, so you can retokenize based on that. I'm 
thinking of a recursive function that takes your "raw" token array and the 
position of the opening paren, then returns the closing position of the paren 
with the value between them. Where it's a multi-argument label, the comma can 
be treated specially. At each stage, your series of raw tokens will turn into a 
series of values and operators, and then you can evaluate it, because, 
ultimately, the innermost, paren-enclosed expressions have to be able to be 
evaluated, so you drill down until you get there, then keep working outward.

If it helps, this is working perfectly in my head.     
_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.
  
                             Top             Display posts from previous: All 
posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost 
timeSubject AscendingDescending          Page 1 of 2
   [ 25 posts ]   Go to page 1, 2  Next     
-- 
Over 1500 classes with 29000 functions in one REALbasic plug-in collection. 
The Monkeybread Software Realbasic Plugin v9.3. 
http://www.monkeybreadsoftware.de/realbasic/plugins.shtml

[email protected]

Reply via email to