Re: Parsing a search string

2004-12-31 Thread Fuzzyman
That's not bad going considering you've only run out of alcohol at 6 in
the morning and *then* ask python questions.

Anyway - you could write a character-by-character parser function that
would do that in a few minutes...
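
A minimal sketch of such a character-by-character parser follows. It is written for Python 3 and assumes that phrases are marked with double quotes, as in moo cow "farmer john" -zug, and that a leading '-' marks a term to exclude. The name parse_search is made up for this illustration and is not taken from any module mentioned in this thread.

```python
def parse_search(text):
    """Split text into (wanted, unwanted) term lists, honouring double quotes."""
    wanted, unwanted = [], []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():            # skip separators between terms
            i += 1
            continue
        negated = text[i] == '-'         # a leading '-' marks an excluded term
        if negated:
            i += 1
        if i < n and text[i] == '"':     # quoted phrase: read up to the closing quote
            end = text.find('"', i + 1)
            if end == -1:                # unterminated quote: take the rest of the text
                end = n
            term = text[i + 1:end]
            i = end + 1
        else:                            # bare word: read up to the next whitespace
            start = i
            while i < n and not text[i].isspace():
                i += 1
            term = text[start:i]
        if term:                         # ignore empty terms such as a lone '-'
            (unwanted if negated else wanted).append(term)
    return wanted, unwanted


print(parse_search('moo cow "farmer john" -zug'))
# (['moo', 'cow', 'farmer john'], ['zug'])
```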

My 'listquote' module has one - but it splits on commas not whitespace.
Sounds like you're looking for a one-liner, though. Regular
expressions *could* do it...
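
For comparison, here is a rough regular-expression version (Python 3). It is only a sketch under the same assumptions: double quotes mark a phrase and a leading '-' marks exclusion. The names TERM and parse_search_re are invented for the example.

```python
import re

# Optional '-', then either a double-quoted phrase or a run of non-space characters.
TERM = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def parse_search_re(text):
    """Return (wanted, unwanted) term lists found in text."""
    wanted, unwanted = [], []
    for minus, phrase, word in TERM.findall(text):
        term = phrase or word            # exactly one of the two groups is non-empty
        if term:
            (unwanted if minus else wanted).append(term)
    return wanted, unwanted


print(parse_search_re('moo cow "farmer john" -zug'))
# (['moo', 'cow', 'farmer john'], ['zug'])
```

Malformed input, such as an unterminated quote, is not handled specially here. The pattern simply falls back to treating the text as a bare word.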

Regards,

Fuzzy
http://www.voidspace.org.uk/atlantibots/pythonutils.html#llistquote

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parsing a search string

2004-12-31 Thread Reinhold Birkenfeld
Freddie wrote:
> Happy new year! Since I have run out of alcohol, I'll ask a question that I
> haven't really worked out an answer for yet. Is there an elegant way to turn
> something like:
>
>   moo cow "farmer john" -zug
>
> into:
>
> ['moo', 'cow', 'farmer john'], ['zug']
>
> I'm trying to parse a search string so I can use it for SQL WHERE
> constraints, preferably without horrifying regular expressions. Uhh yeah.

The shlex approach, finished:

import shlex

searchstring = 'moo cow "farmer john" -zug'
lexer = shlex.shlex(searchstring)
# treat '-' as a word character so that '-zug' stays one token
lexer.wordchars += '-'
poslist, neglist = [], []
while 1:
    token = lexer.get_token()
    # token is '' on eof
    if not token: break
    # remove quotes
    if token[0] in '"\'':
        token = token[1:-1]
    # select in which list to put it
    if token[0] == '-':
        neglist.append(token[1:])
    else:
        poslist.append(token)
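
Since the goal is SQL WHERE constraints, the two lists could then be turned into a parameterised clause. The following is only a sketch (Python 3): the function name build_where and the column name body are hypothetical, and the '?' placeholder style is the one used by the sqlite3 module, so other database drivers may need a different marker.

```python
def build_where(poslist, neglist, column="body"):
    """Return (sql_fragment, params) for LIKE-based matching on one column.

    `column` is interpolated into the SQL text, so it must be a trusted
    identifier. The search terms themselves are passed only as parameters.
    """
    clauses = [f"{column} LIKE ?" for _ in poslist]
    clauses += [f"{column} NOT LIKE ?" for _ in neglist]
    params = [f"%{term}%" for term in poslist + neglist]
    return " AND ".join(clauses), params


where, params = build_where(['moo', 'cow', 'farmer john'], ['zug'])
print(where)
# body LIKE ? AND body LIKE ? AND body LIKE ? AND body NOT LIKE ?
print(params)
# ['%moo%', '%cow%', '%farmer john%', '%zug%']
```

Passing the terms as parameters rather than pasting them into the SQL string avoids quoting problems. Note that '%' and '_' inside a term still act as LIKE wildcards in this sketch, and that two empty lists produce an empty fragment.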

regards,
Reinhold


Re: Parsing a search string

2004-12-31 Thread It's me
I am right in the middle of doing text parsing so I used your example as a
mental exercise.   :-)

Here's a NDFA for your text:

   b  0 1-9 a-Z ,  . +  -   '\n
S0: S0 E   E  S1  E E E S3 E S2  E
S1: T1 E   E  S1  E E E  E  E  E T1
S2: S2 E   E  S2  E E E  E  E T2  E
S3: T3 E   E  S3  E E E  E  E  E T3

and the end-states are:

E: error in text
T1: You have the words: moo, cow
T2: You get farmer john (w quotes)
T3: You get zug

Can't guarantee that I did it right - I did it really quick - and it's
*specific* to your text string.

Now just need to hire a programmer to write some clean Python parsing code.
:-)

--
It's me





Freddie [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> Happy new year! Since I have run out of alcohol, I'll ask a question that I
> haven't really worked out an answer for yet. Is there an elegant way to turn
> something like:
>
>   moo cow "farmer john" -zug
>
> into:
>
> ['moo', 'cow', 'farmer john'], ['zug']
>
> I'm trying to parse a search string so I can use it for SQL WHERE constraints,
> preferably without horrifying regular expressions. Uhh yeah.
>
>   From 2005,
> Freddie






Re: Parsing a search string

2004-12-31 Thread M.E.Farmer
Ah! that is what the __future__ brings I guess.
Damn that progress making me outdated ;)
Python 2.2.3 ( a lot of  extensions I use are stuck there , so I still
use it)
M.E.Farmer



Re: Parsing a search string

2004-12-31 Thread Reinhold Birkenfeld
M.E.Farmer wrote:
> Ah! that is what the __future__ brings I guess.
> Damn that progress making me outdated ;)
> Python 2.2.3 ( a lot of  extensions I use are stuck there , so I still
> use it)

I'm also positively surprised by how many cute little additions there
are in every new Python version. Great thanks to the great devs!

Reinhold


Re: Parsing a search string

2004-12-31 Thread It's me

Andrew Dalke [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> It's me wrote:
>> Here's a NDFA for your text:
>>
>>    b  0 1-9 a-Z ,  . +  -   '\n
>> S0: S0 E   E  S1  E E E S3 E S2  E
>> S1: T1 E   E  S1  E E E  E  E  E T1
>> S2: S2 E   E  S2  E E E  E  E T2  E
>> S3: T3 E   E  S3  E E E  E  E  E T3
>
> Now if I only had an NDFA for parsing that syntax...


Just finished one (don't ask me to show it - very clumsy Python code -
still in learning mode).   :)

Here's one for parsing integer:

#   b 0 1-9 , . + - '  a-Z \n
# S0: S0 S0 S1 T0 E S2 S2 E E E   T0
# S1: S3 S1 S1 T1 E E E E E E   T1
# S2: E S2 S1 E E E E E E E   E
# S3: S3 T2 T2 T1 T2 T2 T2 T2 T2 T2  T1

T0: you got a null token
T1: you got a good token, separator was ,
T2: you got a good token b, separator was  
E: bad token


>   :)
> Andrew
> [EMAIL PROTECTED]





Re: Parsing a search string

2004-12-31 Thread Brian Beck
Freddie wrote:
> I'm trying to parse a search string so I can use it for SQL WHERE
> constraints, preferably without horrifying regular expressions. Uhh yeah.

If you're interested, I've written a function that parses query strings 
using a customizable version of Google's search syntax.

Features include:
  - Binary operators like OR
  - Unary operators like '-' for exclusion
  - Customizable modifiers like Google's site:, intitle:, inurl: syntax
  - *No* query is an error (invalid characters are fixed up, etc.)
  - Result is a dictionary in one of two possible forms, both geared 
towards being input to a search method for your database

I'd be glad to post the code, although I'd probably want to have a last 
look at it before I let others see it...

--
Brian Beck
Adventurer of the First Order


Re: Parsing a search string

2004-12-31 Thread John Machin
Andrew Dalke wrote:
> It's me wrote:
>> Here's a NDFA for your text:
>>
>>    b  0 1-9 a-Z ,  . +  -   '\n
>> S0: S0 E   E  S1  E E E S3 E S2  E
>> S1: T1 E   E  S1  E E E  E  E  E T1
>> S2: S2 E   E  S2  E E E  E  E T2  E
>> S3: T3 E   E  S3  E E E  E  E  E T3
>
> Now if I only had an NDFA for parsing that syntax...

Parsing your sentence as written (if I only had): If you were the
sole keeper of the secret??

Parsing it as intended (if only I had), and ignoring the smiley:
Looks like a fairly straight-forward state-transition table to me. The
column headings are not aligned properly in the message, b means blank,
a-Z is bletchworthy, but the da Vinci code it ain't.

If only we had an NDFA (whatever that is) for guessing what acronyms
mean ...

Where I come from:
DFA = deterministic finite-state automaton
NFA = non-det..
SFA = content-free
NFI = concept-free
NDFA = National Dairy Farmers' Association

HTH, and Happy New Year!



Re: Parsing a search string

2004-12-31 Thread It's me

John Machin [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> Andrew Dalke wrote:
>> It's me wrote:
>>> Here's a NDFA for your text:
>>>
>>>    b  0 1-9 a-Z ,  . +  -   '\n
>>> S0: S0 E   E  S1  E E E S3 E S2  E
>>> S1: T1 E   E  S1  E E E  E  E  E T1
>>> S2: S2 E   E  S2  E E E  E  E T2  E
>>> S3: T3 E   E  S3  E E E  E  E  E T3
>>
>> Now if I only had an NDFA for parsing that syntax...
>
> Parsing your sentence as written (if I only had): If you were the
> sole keeper of the secret??
>
> Parsing it as intended (if only I had), and ignoring the smiley:
> Looks like a fairly straight-forward state-transition table to me.

Exactly.

> The
> column headings are not aligned properly in the message, b means blank,
> a-Z is bletchworthy, but the da Vinci code it ain't.
>
> If only we had an NDFA (whatever that is) for guessing what acronyms
> mean ...


I believe  (I am not a computer science major):

NDFA = non-deterministic finite automata

and:

S: state
T: terminal
E: error

So S1 means State #1, T1 means Terminal #1, and so forth.

You are correct that parsing that table is not hard.

a) Set up a stack and place the buffer onto the stack, start with S0
b) For each character that comes from the stack, look up the next state
for that character
c) If it's not a T or E state, jump to that state
d) If it's a T or E state, finish
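
The steps above can be sketched as a small table-driven loop (Python 3). This is not the table posted earlier in the thread: TABLE below is a simplified stand-in with made-up character classes, the sketch walks the text with an index instead of a stack, and the names classify, next_token and tokenize are invented for the illustration.

```python
# Hypothetical, simplified transition table: state -> {character class -> next state}.
# Any combination not listed leads to the error state "E".
TABLE = {
    "S0": {"space": "S0", "alpha": "S1", "minus": "S3", "quote": "S2"},
    "S1": {"alpha": "S1", "space": "T1", "end": "T1"},                    # bare word
    "S2": {"alpha": "S2", "space": "S2", "minus": "S2", "quote": "T2"},   # quoted phrase
    "S3": {"alpha": "S3", "space": "T3", "end": "T3"},                    # '-' word
}


def classify(ch):
    """Map one character (or None at end of input) to a column of the table."""
    if ch is None:
        return "end"
    if ch.isspace():
        return "space"
    if ch.isalnum():
        return "alpha"
    return {"-": "minus", '"': "quote"}.get(ch, "other")


def next_token(text, pos=0):
    """Run the machine from S0; return (final_state, token, next_pos)."""
    state, chars = "S0", []
    while True:
        ch = text[pos] if pos < len(text) else None
        new_state = TABLE[state].get(classify(ch), "E")
        if new_state[0] in "TE":
            # T or E state: finish, stepping past the character that ended the token
            return new_state, "".join(chars), min(pos + 1, len(text))
        # Collect token characters; the blank, '-' or '"' seen in S0 is not part of it
        if state != "S0" or new_state == "S1":
            chars.append(ch)
        state, pos = new_state, pos + 1


def tokenize(text):
    """Apply next_token repeatedly until only whitespace is left."""
    tokens, pos = [], 0
    while text[pos:].strip():
        state, token, pos = next_token(text, pos)
        if state == "E":
            raise ValueError(f"bad input near position {pos}")
        tokens.append((state, token))
    return tokens


print(tokenize('moo cow "farmer john" -zug'))
# [('T1', 'moo'), ('T1', 'cow'), ('T2', 'farmer john'), ('T3', 'zug')]
```

Keeping the transitions in a dictionary means the loop itself never changes; supporting another kind of token is a matter of adding rows or columns to the table.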




Re: Parsing a search string

2004-12-31 Thread Freddie
Reinhold Birkenfeld wrote:
> Freddie wrote:
>> Happy new year! Since I have run out of alcohol, I'll ask a question that I
>> haven't really worked out an answer for yet. Is there an elegant way to turn
>> something like:
>>
>>   moo cow "farmer john" -zug
>>
>> into:
>>
>> ['moo', 'cow', 'farmer john'], ['zug']
>>
>> I'm trying to parse a search string so I can use it for SQL WHERE constraints,
>> preferably without horrifying regular expressions. Uhh yeah.
>
> The shlex approach, finished:
>
> searchstring = 'moo cow "farmer john" -zug'
> lexer = shlex.shlex(searchstring)
> lexer.wordchars += '-'
> poslist, neglist = [], []
> while 1:
>     token = lexer.get_token()
>     # token is '' on eof
>     if not token: break
>     # remove quotes
>     if token[0] in '"\'':
>         token = token[1:-1]
>     # select in which list to put it
>     if token[0] == '-':
>         neglist.append(token[1:])
>     else:
>         poslist.append(token)
>
> regards,
> Reinhold

Thanks for this, though there was one issue:

>>> lexer = shlex.shlex('moo cow +"farmer john" -dog')
>>> lexer.wordchars += '-+'
>>> while 1:
...     tok = lexer.get_token()
...     if not tok: break
...     print tok
...
moo
cow
+"farmer
john"
-dog

The '+"farmer john"' part would be turned into two separate words, '+"farmer'
and 'john"'. I ended up using shlex.split() (which the docs say is new in
Python 2.3), which gives me the desired result. Thanks for the help from
yourself and M.E.Farmer :)

Freddie

>>> shlex.split('moo cow +"farmer john" -"evil dog"')
['moo', 'cow', '+farmer john', '-evil dog']
>>> shlex.split('moo cow +"farmer john" -"evil dog" +elephant')
['moo', 'cow', '+farmer john', '-evil dog', '+elephant']
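
Building on the shlex.split() result, a short sketch (Python 3) of sorting the terms by their '+' or '-' prefix might look like this. The function name split_terms and the three-way grouping are assumptions made for the example, not something posted in this thread.

```python
import shlex


def split_terms(query):
    """Return (plain, required, excluded) term lists from a search string."""
    plain, required, excluded = [], [], []
    for term in shlex.split(query):      # shlex.split() keeps quoted phrases together
        sign, rest = term[:1], term[1:]
        if sign == '-' and rest:
            excluded.append(rest)
        elif sign == '+' and rest:
            required.append(rest)
        else:                            # no prefix, or a lone '+' / '-'
            plain.append(term)
    return plain, required, excluded


print(split_terms('moo cow +"farmer john" -"evil dog" +elephant'))
# (['moo', 'cow'], ['farmer john', 'elephant'], ['evil dog'])
```

shlex.split() raises ValueError when a quote is left unclosed, so input typed by a user probably wants a try/except around the call.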