Re: [racket-users] unicode (or just plain byte) regular expression positions?

2021-01-18 Thread Jon Zeppieri
On Mon, Jan 18, 2021 at 10:53 PM Tim Meehan wrote:
>
> Say that I have a strange character group that I want to find in a binary 
> file.
> I wanted to use something like this:
>
> (define needle (list->string (map integer->char (list #xab #xcd #xef))))
> (define needle-offset
>   (call-with-input-file "big_binary_blob.bin"
>     #:mode 'binary
>     (λ (p)
>       (regexp-match-positions (regexp needle) p))))
>
> The "regexp-match-positions" returns #f (even though I know that needle is in 
> there, I put it there). Is there a better way to go about this? The binary 
> blob is about 100 MiB or so, if that helps.


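You're searching for a sequence of Unicode code points (U+00AB,
U+00CD, U+00EF) in a string, but I think you want to search for a
byte sequence in a byte string. When a string regexp is matched
against an input port, it matches the UTF-8 encoding of those
characters, not the raw bytes AB CD EF. You can read in the file as
bytes and use a byte regexp. So: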

(define needle (list->bytes (list #xab #xcd #xef)))
(define needle-offset
  (call-with-input-file "big_binary_blob.bin"
    #:mode 'binary
    (λ (p)
      (regexp-match-positions (byte-regexp needle) p))))
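
To see the difference without a 100 MiB file, here is a minimal sketch
that runs both kinds of regexp over an in-memory port. The names
haystack, string-needle, byte-needle, string-result and byte-result are
made up for illustration, and the six-byte haystack is only a stand-in
for the real blob:

```racket
#lang racket/base

;; Hypothetical stand-in for the file: filler bytes around AB CD EF.
(define haystack (bytes #x00 #x01 #xab #xcd #xef #x02))

;; String regexp: against a port it looks for the UTF-8 encoding of
;; U+00AB U+00CD U+00EF (bytes C2 AB C3 8D C3 AF), which haystack
;; does not contain.
(define string-needle
  (regexp (list->string (map integer->char (list #xab #xcd #xef)))))

;; Byte regexp: looks for the raw bytes AB CD EF.
(define byte-needle (byte-regexp (bytes #xab #xcd #xef)))

(define string-result
  (regexp-match-positions string-needle (open-input-bytes haystack)))
(define byte-result
  (regexp-match-positions byte-needle (open-input-bytes haystack)))

string-result ; no match
byte-result   ; start and end byte offsets of the match
```

If the needle could ever contain bytes that are regexp metacharacters
(for example #x2e, which is "."), passing it through regexp-quote
before byte-regexp should keep them literal.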

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/CAKfDxxwCcTvg8sNJphkWHe%3DUDpV_c3dLKLReTX6kL%3DD2ZVOOAA%40mail.gmail.com.


[racket-users] unicode (or just plain byte) regular expression positions?

2021-01-18 Thread Tim Meehan
Say that I have a strange character group that I want to find in a binary
file.
I wanted to use something like this:

(define needle (list->string (map integer->char (list #xab #xcd #xef))))
(define needle-offset
  (call-with-input-file "big_binary_blob.bin"
    #:mode 'binary
    (λ (p)
      (regexp-match-positions (regexp needle) p))))

The "regexp-match-positions" returns #f (even though I know that needle is
in there, I put it there). Is there a better way to go about this? The
binary blob is about 100 MiB or so, if that helps.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/CACgrOxJwV_xGV1P96F6rY%3D%2BBdXDRSjgRPUrz1BhUUiMXqqt3vA%40mail.gmail.com.