Re: bean-query grep() function

2018-11-18 Thread shreedharhardikar


On Thursday, November 15, 2018 at 12:17:28 AM UTC-6, Martin Blais wrote:
>
> Thanks Shreedhar.
> I'll merge your patch in the next few weeks (I have some time off coming 
> up).
>
>
Sounds good. I was also able to figure out the PR system with hg/Bitbucket,
so I converted this into a PR, which should make it easier to review and merge:
https://bitbucket.org/blais/beancount/pull-requests/85/support-subgroup-selection-using-grepn/diff

More PRs are on their way.

Also, in case this makes any difference, I'm much more comfortable with git,
which means that for now hg kind of gets in my way. It should get easier as I
figure out Mercurial's model.

Cheers.

-Shreedhar 

 


Re: bean-query grep() function

2018-11-14 Thread Martin Blais
Thanks Shreedhar.
I'll merge your patch in the next few weeks (I have some time off coming
up).



Re: bean-query grep() function

2018-11-12 Thread Shreedhar Hardikar
Well, I thought I'd implement it anyway:

GREPN(pattern, string, N):
The pattern string can contain parenthesized subgroups. The function returns
only the Nth subgroup of the matched string. N=0 returns the matched portion
of the string, ignoring any parenthesized subgroups.

I guess one could store the pattern as a metadata item and then use this
function to select the subgroup.
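
A minimal sketch of the behaviour described above, written as a plain Python function rather than as the bean-query EvalFunction subclass in the patch. The name `grepn` and the choice to return None for a failed match or an undefined group index are assumptions made for illustration, not necessarily what the patch does:

```python
import re


def grepn(pattern, string, n):
    """Return the Nth subgroup of the first match of pattern in string.

    n=0 returns the whole matched portion, ignoring any subgroups.
    Returns None when the pattern does not match (assumed behaviour).
    """
    match = re.search(pattern, string)
    if match is None:
        return None
    # Assumption: a group index the pattern does not define gives None
    # instead of letting re raise IndexError.
    if n < 0 or n > match.re.groups:
        return None
    return match.group(n)


print(grepn(r'a (b) (c)', 'asda b c', 0))  # whole matched portion
print(grepn(r'a (b) (c)', 'asda b c', 2))  # second subgroup
```

Passing N explicitly keeps patterns with several subgroups usable, since the caller picks which one to return.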




-- 
You received this message because you are subscribed to the Google Groups 
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to beancount+unsubscr...@googlegroups.com.
To post to this group, send email to beancount@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beancount/CAAY9sD-MjrNtCgnkeS9RS2aT0XLUEpRXUt0_ej6g2W3TrdoFzg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


grepn-subgroup.patch
Description: Binary data


Re: bean-query grep() function

2018-11-12 Thread Shreedhar Hardikar
On Mon, Nov 12, 2018 at 7:47 PM Martin Blais  wrote:

> On Mon, Nov 12, 2018 at 4:47 PM  wrote:
>
>> Hi,
>>
>> Just noticed that the implementation of the grep function in bean-query
>> doesn't make sense to me from the description:
>>
>> class Grep(query_compile.EvalFunction):
>>     "Match a group against a string and return only the matched portion."
>>     __intypes__ = [str, str]
>>
>>     def __init__(self, operands):
>>         super().__init__(operands, str)
>>
>>     def __call__(self, context):
>>         args = self.eval_args(context)
>>         match = re.search(args[0], args[1])
>>         if match:
>>             return match.group(0)
>>
>> According to the description I think it should do:
>>         if match:
>>             # Get the first matched group; group(0) is the entire match
>>             return match.group(1)
>>
>> or even:
>>         if match:
>>             # Get the last matched group, or the entire match if there
>>             # are no groups
>>             return match.group(len(match.groups()))
>>
>> Reference: https://docs.python.org/3/library/re.html#match-objects
>>
>
> bergamot [hg|default]:~/p/invest/options$ python3
> Python 3.7.0 (default, Jul 30 2018, 01:44:42)
> [GCC 7.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> re.search('a+', 'cccaadd').group(0)
> 'aa'
> >>>
>
> 'aa' is the matched portion.
> WAI
>
>
Ah, so you had no use for the subgroups - which are captured using the
parentheses?


>>
>> If it is implemented as intended, I suppose it would be nice to have an
>> overloaded grep() function that takes a 3rd parameter of type int, for the
>> group id. I can send a patch for that if you prefer that, although I think
>> the second implementation should work for both styles:
>>
>> >>> import re
>> >>> m = re.search('a (b) c', 'asda b c')
>> >>> m.group(len(m.groups()))
>> 'b'
>> >>> m = re.search('a b c', 'asda b c')
>> >>> m.group(len(m.groups()))
>> 'a b c'
>>
>> Thanks,
>> Shreedhar
>>
>
> Patches welcome.
>
>
>
Anyway, the one-line change I suggested would work for your scenario also.
Basically, if there are no parenthesized groups, it'll just return the matched
portion (as it does now), and otherwise it'll return the last group in the
pattern string. Not sure it makes sense to add an integer selector: since
GREP returns only one string, why would one have a pattern with multiple
subgroups?
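
To make the one-line change concrete, here is a small sketch as a standalone function. The name `grep_last_group` is hypothetical and not part of beancount; it only wraps the `match.group(len(match.groups()))` expression from the thread. The last call shows one caveat: an optional last group that did not take part in the match yields None even though the pattern matched.

```python
import re


def grep_last_group(pattern, string):
    """Return the last subgroup of the match, or the whole matched
    portion if the pattern has no subgroups. None if nothing matches."""
    match = re.search(pattern, string)
    if match is None:
        return None
    # len(match.groups()) is 0 for a pattern without parentheses, so this
    # falls back to group(0), the entire matched portion.
    return match.group(len(match.groups()))


print(grep_last_group('a b c', 'asda b c'))      # no subgroups: whole match
print(grep_last_group('a (b) c', 'asda b c'))    # the only subgroup
print(grep_last_group('(a) (b) c', 'asda b c'))  # last of two subgroups
print(grep_last_group('a( b)?', 'xa'))           # optional group not matched
```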

Anyway, I've attached a patch with some tests.

- Shreedhar





grep-subgroup.patch
Description: Binary data


Re: bean-query grep() function

2018-11-12 Thread Martin Blais
On Mon, Nov 12, 2018 at 4:47 PM  wrote:

> Hi,
>
> Just noticed that the implementation of the grep function in bean-query
> doesn't make sense to me from the description:
>
> class Grep(query_compile.EvalFunction):
>     "Match a group against a string and return only the matched portion."
>     __intypes__ = [str, str]
>
>     def __init__(self, operands):
>         super().__init__(operands, str)
>
>     def __call__(self, context):
>         args = self.eval_args(context)
>         match = re.search(args[0], args[1])
>         if match:
>             return match.group(0)
>
> According to the description I think it should do:
>         if match:
>             # Get the first matched group; group(0) is the entire match
>             return match.group(1)
>
> or even:
>         if match:
>             # Get the last matched group, or the entire match if there
>             # are no groups
>             return match.group(len(match.groups()))
>
> Reference: https://docs.python.org/3/library/re.html#match-objects
>

bergamot [hg|default]:~/p/invest/options$ python3
Python 3.7.0 (default, Jul 30 2018, 01:44:42)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.search('a+', 'cccaadd').group(0)
'aa'
>>>

'aa' is the matched portion.
WAI
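
For comparison, a short sketch of how group(0) and the numbered subgroups differ once the pattern contains parentheses. The pattern and input string are made up for illustration:

```python
import re

match = re.search(r'(\d{4})-(\d{2})', 'invoice 2018-11 paid')

whole = match.group(0)   # entire matched portion, what grep() returns today
year = match.group(1)    # first parenthesized subgroup
month = match.group(2)   # second parenthesized subgroup
groups = match.groups()  # tuple of all subgroups, excluding group 0

print(whole, year, month, groups)
```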



> If it is implemented as intended, I suppose it would be nice to have an
> overloaded grep() function that takes a 3rd parameter of type int, for the
> group id. I can send a patch for that if you prefer that, although I think
> the second implementation should work for both styles:
>
> >>> import re
> >>> m = re.search('a (b) c', 'asda b c')
> >>> m.group(len(m.groups()))
> 'b'
> >>> m = re.search('a b c', 'asda b c')
> >>> m.group(len(m.groups()))
> 'a b c'
>
> Thanks,
> Shreedhar
>

Patches welcome.



