Re: [Factor-talk] regexp words on string slices

2012-11-30 Thread Naveen Garg
Awesome.  Thanks John.
For my specific project, memory usage is currently a bigger bottleneck
than speed.


On Fri, Nov 30, 2012 at 8:03 PM, John Benediktsson wrote:

> Probably the restrictions to strings were a performance optimization...?
>  It should be pretty easy to get the behavior you want:
>
> IN: scratchpad 0 4 "foo " <slice> R/ foo/ first-match .
> T{ slice { from 0 } { to 3 } { seq "foo " } }
>
> Using this diff makes it work, but causes the regexp benchmark to be
> 20-30% slower...:
>
>
--
Keep yourself connected to Go Parallel: 
INSIGHTS What's next for parallel hardware, programming and related areas?
Interviews and blogs by thought leaders keep you ahead of the curve.
http://goparallel.sourceforge.net
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


Re: [Factor-talk] regexp words on string slices

2012-11-30 Thread John Benediktsson
Probably the restrictions to strings were a performance optimization...?
 It should be pretty easy to get the behavior you want:

IN: scratchpad 0 4 "foo " <slice> R/ foo/ first-match .
T{ slice { from 0 } { to 3 } { seq "foo " } }

Using this diff makes it work, but causes the regexp benchmark to be 20-30%
slower...:

```diff
diff --git a/basis/regexp/compiler/compiler.factor
b/basis/regexp/compiler/compi
index a8b3c91..fb5fa4f 100644
--- a/basis/regexp/compiler/compiler.factor
+++ b/basis/regexp/compiler/compiler.factor
@@ -101,7 +101,7 @@ C: <box> box
 : transitions>quot ( transitions final-state? -- quot )
 dup shortest? get and [ 2drop [ drop nip ] ] [
 [ split-literals swap case>quot ] dip backwards? get
-'[ { fixnum string } declare _ _ _ step ]
+'[ { fixnum sequence } declare _ _ _ step ]
 ] if ;

 : word>quot ( word dfa -- quot )
diff --git a/basis/regexp/regexp.factor b/basis/regexp/regexp.factor
index 6070921..3e66785 100644
--- a/basis/regexp/regexp.factor
+++ b/basis/regexp/regexp.factor
@@ -29,7 +29,7 @@ M: lookbehind question>quot ! Returns ( index string -- ? )

 : check-string ( string -- string )
 ! Make this configurable
-dup string? [ "String required" throw ] unless ;
+dup sequence? [ "String required" throw ] unless ;

 : match-index-from ( i string regexp -- index/f )
 ! This word is unsafe. It assumes that i is a fixnum
@@ -166,7 +166,7 @@ DEFER: compile-next-match
 dup '[
 dup \ next-initial-word = [
 drop _ [ compile-regexp dfa>> def>> ] [ reverse-regexp? ] bi
-'[ { array-capacity string regexp } declare _ _ next-match ]
+'[ { array-capacity sequence regexp } declare _ _ next-match ]
 ( i string regexp -- start end string ) define-temp
 ] when
 ] change-next-match ;
```
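
For anyone who cannot patch the regexp vocabulary, one possible workaround is to copy the slice into a fresh string before matching. This is only a sketch: `first-match-copying` is a hypothetical helper name, not part of the regexp vocabulary, and the copy costs memory proportional to the slice length, so it may not suit a memory-constrained project.

```factor
USING: kernel regexp sequences strings ;
IN: scratchpad

! Hypothetical helper, not part of the regexp vocabulary.
! Copies the slice into a new string so that the unmodified
! check-string accepts it, then matches on the copy.
: first-match-copying ( slice regexp -- slice/f )
    [ >string ] dip first-match ;

! Example call; the result is left on the stack.
0 4 "foo bar " <slice> R/ foo/ first-match-copying
```

The slice returned by this helper refers to the copied string, so its `from` and `to` indices are relative to the start of the input slice rather than to the original sequence.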


On Fri, Nov 30, 2012 at 5:43 PM, Naveen Garg wrote:

> Any reason slices are not allowed to be passed to regexp words like
> "first-match" ?
> I tried modifying the word:
> : check-string ( string -- string )
> ! Make this configurable
> !  dup string? [ "String required" throw ] unless ;
> dup dup string?
> swap regexp? or [ "String required" throw ] unless ;
>
> but doing a
> > refresh-all
> doesn't change anything, running
> > 0 4 "foo " <slice> R/ foo/ first-match
> still throws the error "String required".
>
> For the curious, I am working on an optimized / space efficient version of
> the re-pair semi-static dictionary compression algorithm (
> http://www.cbrc.jp/~rwan/en/restore.html) .
> I have it working in AutoHotkey and am trying to port it to Factor to see
> if I get any performance gains, and also just for fun and to learn Factor.
>


Re: [Factor-talk] regexp words on string slices

2012-11-30 Thread Naveen Garg
Oops, I had a typo: I was checking for regexp? instead of slice?
: check-string ( string -- string )
! Make this configurable
! dup string? [ "String required" throw ] unless ;
dup dup string?
swap slice? or [ "String required" throw ] unless ;

> 0 4 "foo bar " <slice> R/ foo/ first-match
but now I get a memory access error: ...
Naveen