Hi,

On Thu, 2011-08-25 at 12:54 +0200, Paul Johnson wrote:
> On Thu, Aug 25, 2011 at 12:02:37PM +0200, Honza Mach wrote:
> > Hi,
> > 
> > On Thu, 2011-08-25 at 11:35 +0200, Paul Johnson wrote:
> > > 
> > >   say "extracted: ", /^(\d+\s+.{$len})/ if ($len) = /^(\d+)/
> > > Sometimes two passes are better than one.
> > > 
> > 
> > Thank you for your advice, however I wanted to do it in one pass because
> > of the following reasons:
> > 
> > 1. I wanted to compile the pattern because of the performance
> 
> Ah.  Then you have already benchmarked a solution, found it to be too
> slow, profiled your code and discovered that this section of the code
> was the bottleneck in such a way that compiling the regular expression
> would solve the problem?
> 
> Why on earth are you worried about performance?  Especially when your
> data is coming from a socket.  I suggest not worrying about performance
> until you need to.
> 
> Please don't think I'm getting at you personally here.  Perhaps you have
> already done this.  But in general there is far too much talk about
> speed on this list.  Make your code correct.  Make it nice.  Until that
> point performance cannot be a concern.

Point taken, you are right. I haven't done any benchmarking so far, but
I wanted it to be as efficient as possible, because the system I am
building needs to be capable of handling as many events per second as
possible. It seems I got caught in the over-optimizing trap every
developer should avoid ;)

> >    (I guess it is not possible to compile the pattern, when there are   
> >    some substitutions of the variables from outside of the RE engine, 
> >    or am I wrong?)
> 
> See below.
> 
> > 2. Perhaps I should have mentioned, that the application is parsing the 
> >    continuous string stream coming from a socket, there are many of 
> >    these "commands" coming one after each other and there will not 
> >    always be the complete command present in the buffer, next chunk 
> >    may come later. I was actually trying to use it this way:
> > 
> > my $string = "5 abcde 2 fg 3 hij 6 klm";
> > while ($string =~ s/\s*(\d+)\s+(.{\1})//)
> > {
> >     print "extracted: $1 $2\n";
> > } 
> > 
> >    This way the while loop elegantly fails, when the pattern does not 
> >    match and when the next data arrives, the check is made again.
> 
> OK.  So try this:
> 
>   say "extracted: $1 $2" while s/\s*(\d+)(?{$len = $^N})\s+((??{"." x $len}))//
> 
> That uses features marked as experimental.  It would probably be a good
> idea to read perlre to understand them.  You can even compile that if
> you want to but I'm not sure what difference, if any, you will notice in
> doing so.

Thank you for the idea, I'll try that.
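
For comparison, here is a minimal sketch of the two-pass approach applied
to the buffer loop, without the experimental constructs. The helper name
extract_commands is hypothetical and not from this thread. It assumes the
buffer only ever grows at the end, and unlike "." in the regex, substr
also accepts newlines in the payload.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pull every complete
# "<length> <payload>" command off the front of the buffer and leave any
# incomplete tail in place for the next chunk from the socket.
sub extract_commands {
    my ($buffer_ref) = @_;
    my @commands;
    # First pass: read the length prefix and the whitespace after it.
    while ($$buffer_ref =~ /\A\s*(\d+)\s+/) {
        my ($len, $start) = ($1, $+[0]);
        # Stop if the payload has not fully arrived yet.
        last if length($$buffer_ref) - $start < $len;
        # Second pass: take exactly $len characters as the payload.
        push @commands, [ $len, substr($$buffer_ref, $start, $len) ];
        # Remove the consumed command from the front of the buffer.
        substr($$buffer_ref, 0, $start + $len, '');
    }
    return @commands;
}

my $buffer = "5 abcde 2 fg 3 hij 6 klm";
print "extracted: $_->[0] $_->[1]\n" for extract_commands(\$buffer);
```

With the sample string, the incomplete "6 klm" command stays in $buffer
until more data is appended and the helper is called again.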

> > I was just trying to find the most elegant solution, but if it is not
> > possible, I will create some work around, that is not the problem.
> 
> Some might class this as an elegant solution.  Others might take the
> opposite position.
> 
> Good luck,


Thanks for the advice.

Honza Mach
