Re: ANN: Another binary parser combinator - this time for java's streams

2014-02-04 Thread Stathis Sideris
Thanks, header seems very useful and relevant to what I was doing, but I 
ended up doing something slightly different because I needed to include the 
information retrieved using the chunk header codec in the final result 
(specifically, the type of the chunk). Here is some code:

https://gist.github.com/stathissideris/8801295

select-codec is almost identical to header (didn't bother with writing in 
this case), but it also merges the result of the decision-codec with the 
result of the selected codec. Of course it's less generic than header 
because it makes the assumption that we're dealing with maps. Also, note 
the use of core.match to decide on what codec to use.
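
For readers who cannot open the gist, the idea can be sketched in plain Clojure without the library. This is only an illustration and not the select-codec from the gist: the names read-header, body-reader and read-chunk, the "SIZE" chunk type and its layout are made up for the example, plain `case` stands in for core.match, and the CRC field is left out for brevity.

```clojure
(import '[java.io ByteArrayInputStream DataInputStream])

(defn read-header
  "Decision step: reads the length and the 4-character type tag."
  [^DataInputStream in]
  (let [length (.readInt in)            ; DataInputStream reads big-endian ints
        tag    (byte-array 4)]
    (.readFully in tag)
    {:length length :type (String. tag "US-ASCII")}))

(defn body-reader
  "Picks a body reader based on the header. The \"SIZE\" layout is hypothetical."
  [{:keys [type length]}]
  (case type
    "SIZE" (fn [^DataInputStream in]
             {:width (.readInt in) :height (.readInt in)})
    ;; default: keep the raw bytes of unknown chunk types
    (fn [^DataInputStream in]
      (let [buf (byte-array length)]
        (.readFully in buf)
        {:data (vec buf)}))))

(defn read-chunk
  "Merges the header map into the result of the selected body reader."
  [in]
  (let [header (read-header in)]
    (merge header ((body-reader header) in))))

;; length 8, tag "SIZE", width 2, height 3
(def sample-bytes
  (byte-array (map unchecked-byte [0 0 0 8, 83 73 90 69, 0 0 0 2, 0 0 0 3])))

(read-chunk (DataInputStream. (ByteArrayInputStream. sample-bytes)))
;=> {:length 8, :type "SIZE", :width 2, :height 3}
```

As in the gist, the merge step is what ties this to maps: it only works because both the header and the body decode to maps.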

Stathis


On Monday, 3 February 2014 16:50:12 UTC, Steffen Dienst wrote:

 I would use header for this:

 (def chunk
   (b/header :int-be
             #(b/ordered-map
                :type (b/repeated :byte :length 4)
                :data (b/repeated :byte :length %)
                :crc (b/repeated :byte :length 4))
             #(count (:data %))))

 The resulting data structure would not contain the length field in this
 case. The length is only used to configure the inner codec for the body (the
 map with :type, :data and :crc). You can read this codec as: read a
 big-endian integer, then use this value to construct a new codec to read
 the body. When writing, count the :data field, write the length using
 :int-be and then write the body.

 Steffen


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: ANN: Another binary parser combinator - this time for java's streams

2014-02-03 Thread Stathis Sideris
Hello,

Is it possible to use 'repeated with a dynamic size if the length-defining 
prefix does not directly precede the content? For example, see PNG chunks:

http://en.wikipedia.org/wiki/Portable_Network_Graphics#.22Chunks.22_within_the_file

The codec would be:

(def chunk
  (b/ordered-map
   :length :int-be
   :type (b/repeated :byte :length 4)
   :data (b/repeated :byte :length ???)
   :crc (b/repeated :byte :length 4)))

What do I put in the place of ???

Thanks,

Stathis


On Friday, 31 January 2014 08:12:23 UTC, Steffen Dienst wrote:

 Thanks, I fixed the documentation issues. Feel free to share your id3 tags 
 parser, if you like :) You can see that mine is still stuck at the very 
 beginning..






Re: ANN: Another binary parser combinator - this time for java's streams

2014-02-03 Thread Steffen Dienst
I would use header for this:

(def chunk
  (b/header :int-be
            #(b/ordered-map
               :type (b/repeated :byte :length 4)
               :data (b/repeated :byte :length %)
               :crc (b/repeated :byte :length 4))
            #(count (:data %))))

The resulting data structure would not contain the length field in this
case. The length is only used to configure the inner codec for the body (the
map with :type, :data and :crc). You can read this codec as: read a
big-endian integer, then use this value to construct a new codec to read
the body. When writing, count the :data field, write the length using
:int-be and then write the body.
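
The read and write steps described above can be sketched by hand in plain Clojure on top of java.io. This is not how the library implements header; read-bytes, read-chunk and write-chunk are hypothetical names used only to make the two directions concrete.

```clojure
(import '[java.io ByteArrayInputStream ByteArrayOutputStream
                  DataInputStream DataOutputStream])

(defn read-bytes
  "Reads exactly n bytes from the stream into a vector."
  [^DataInputStream in n]
  (let [buf (byte-array n)]
    (.readFully in buf)
    (vec buf)))

(defn read-chunk
  "Reads the length prefix first, then uses it to size the :data field.
  The length itself is not part of the result."
  [^DataInputStream in]
  (let [length (.readInt in)]           ; DataInputStream reads big-endian ints
    (array-map :type (read-bytes in 4)
               :data (read-bytes in length)
               :crc  (read-bytes in 4))))

(defn write-chunk
  "Derives the length prefix from :data, writes it, then writes the body."
  [^DataOutputStream out {:keys [type data crc]}]
  (.writeInt out (count data))
  (doseq [b (concat type data crc)]
    (.writeByte out (int b))))

;; Round trip through an in-memory stream
(def sample {:type [73 72 68 82] :data [1 2 3] :crc [0 0 0 0]})

(def round-tripped
  (let [bos (ByteArrayOutputStream.)]
    (write-chunk (DataOutputStream. bos) sample)
    (read-chunk (DataInputStream. (ByteArrayInputStream. (.toByteArray bos))))))

(= sample round-tripped)
;=> true
```

The point of the codec version is that both directions come from one definition, whereas here the reader and the writer have to be kept in sync manually.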

Steffen




Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-31 Thread Steffen Dienst
Thanks, I fixed the documentation issues. Feel free to share your id3 tags 
parser, if you like :) You can see that mine is still stuck at the very 
beginning..




Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-30 Thread Michael Gardner
On Jan 30, 2014, at 01:36 , Steffen steffen.die...@gmail.com wrote:

 If you would like to use a specific codec other than :byte or :ubyte but also 
 restrict the number of bytes read this would only work if you expected to 
 have some kind of optional padding after your objects, like: 
 
 (padding inner-codec 4096).

Yes, that's exactly what I need. I didn't try 'padding because the docs seemed 
to say that it works only when encoding.

My only problem is that when decoding, I don't know how many objects to expect 
before the padding (this is for parsing ID3v2 tags). Ideally I'd like to say 
something like (padding (repeated frame-codec) byte-count), with the padding 
taking over once the inner codec fails to parse the next available bytes (but 
see the next point).

 (defn enum [type m]
   (compile-codec type m
                  (clojure.set/map-invert m)))
 So m would be a map of for example keywords to a native datatype like int 
 that would allow you to represent a fixed number of things with distinct 
 binary representations? Looks good to me. What do you think should be the 
 behaviour in case of an unspecified value (not in m)?

I'd expect an exception to be thrown in case of an unspecified value. But when 
decoding, it would be nice if the exception were (optionally?) swallowed when 
occurring inside a 'padding construct, to allow something like the above 
example. Though I don't know how many other binary formats would require 
something like that; I imagine most aren't as dumb as ID3v2.
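
To make the "throw on an unspecified value" behaviour concrete, here is a small sketch in plain Clojure. It assumes nothing about the library: strict-lookup and the frame-kinds table are hypothetical, and whether the resulting functions can simply take the place of m and (clojure.set/map-invert m) in the enum definition above is an assumption.

```clojure
(require '[clojure.set :as set])

(defn strict-lookup
  "Wraps map m in a function that throws instead of returning nil
  when asked for a key that is not in m."
  [m]
  (fn [k]
    (if (contains? m k)
      (get m k)
      (throw (ex-info "Unspecified enum value" {:value k :known (keys m)})))))

(def frame-kinds {:text 0 :url 1 :comment 2}) ; hypothetical table

(def kind->code (strict-lookup frame-kinds))                  ; used when encoding
(def code->kind (strict-lookup (set/map-invert frame-kinds))) ; used when decoding

(kind->code :url) ;=> 1
(code->kind 2)    ;=> :comment
;; (code->kind 7) throws clojure.lang.ExceptionInfo
```

A plain map used as a function returns nil for unknown keys, which is why the explicit check is needed to get an exception.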

 Currently the index in the vector is the index of the bit. Yes, that means 
 LSB-first.

Then the docs seem to be wrong (or at least confusing), since the example code 
for 'bits says the first item corresponds to the highest bit.



Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-30 Thread Steffen Dienst


On Thursday, 30 January 2014 14:05:07 UTC+1, Michael Gardner wrote:

 On Jan 30, 2014, at 01:36 , Steffen steffen...@gmail.com wrote:

  If you would like to use a specific codec other than :byte or :ubyte but 
 also restrict the number of bytes read this would only work if you expected 
 to have some kind of optional padding after your objects, like: 
  
  (padding inner-codec 4096). 

 Yes, that's exactly what I need. I didn't try 'padding because the docs 
 seemed to say that it works only when encoding. 

My bad. I changed the README. When reading, padding always reads the given
number of bytes first and then applies the inner codec to those bytes. When
writing, it appends as many padding bytes as needed so that exactly the
expected number of bytes is written.

 My only problem is that when decoding, I don't know how many objects to
 expect before the padding (this is for parsing ID3v2 tags). Ideally I'd
 like to say something like (padding (repeated frame-codec) byte-count),
 with the padding taking over once the inner codec fails to parse the next
 available bytes (but see the next point).

That's exactly what padding is designed to do: Let's say you know there is
a run of bytes with a known length (from a header field maybe) and you want
to parse an unbounded number of objects within this area. You could use

(padding (repeated inner-codec) 1024)

Another example: Let's assume an input stream with these bytes:
[11 5 0 0 0 9 0 0 0 0x99 0x99 0x99]

; the padding length is determined by the byte header; the inner codec
; `repeated` can only read two integers (8 bytes)
(header :byte #(padding (repeated :int-le) % 0x99) (constantly 11))
=> [5 9] ; now the input stream will be empty
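
The same byte sequence can be walked through by hand in plain Clojure, which may help to see where the [5 9] comes from. This is only a sketch of the effect, not the library's implementation: read-padded-ints is a hypothetical helper, it stops when fewer than four bytes remain rather than when an inner codec fails, and it does not check that the leftover bytes are actually 0x99.

```clojure
(import '[java.nio ByteBuffer ByteOrder])

(defn read-padded-ints
  "Takes a seq of byte values: one length byte followed by the payload.
  Reads as many little-endian ints as fit into the first `length` payload
  bytes and ignores whatever is left over as padding."
  [bs]
  (let [length (bit-and (long (first bs)) 0xFF)
        region (byte-array (map unchecked-byte (take length (rest bs))))
        buf    (.order (ByteBuffer/wrap region) ByteOrder/LITTLE_ENDIAN)]
    (loop [acc []]
      (if (>= (.remaining buf) 4)
        (recur (conj acc (.getInt buf)))
        acc))))

(read-padded-ints [11 5 0 0 0 9 0 0 0 0x99 0x99 0x99])
;=> [5 9]
```

The header byte 11 delimits the region, two 4-byte integers fit into it, and the remaining three bytes are the padding.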
 

  (defn enum [type m]
    (compile-codec type m
                   (clojure.set/map-invert m)))
  So m would be a map of for example keywords to a native datatype like 
 int that would allow you to represent a fixed number of things with 
 distinct binary representations? Looks good to me. What do you think should 
 be the behaviour in case of an unspecified value (not in m)? 

 I'd expect an exception to be thrown in case of an unspecified value. But 
 when decoding, it would be nice if the exception were (optionally?) 
 swallowed when occurring inside a 'padding construct, to allow something 
 like the above example. Though I don't know how many other binary formats 
 would require something like that; I imagine most aren't as dumb as ID3v2. 

Currently codecs don't know about their context. That means a codec can't
behave differently depending on whether it is used within a padding or not,
sorry.

 Currently the index in the vector is the index of the bit. Yes, that 
 means LSB-first. 

 Then the docs seem to be wrong (or at least confusing), since the example 
 code for 'bits says the first item corresponds to the highest bit.

Thanks, I fixed the documentation. 



Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-30 Thread Michael Gardner
On Jan 30, 2014, at 08:10 , Steffen Dienst steffen.die...@gmail.com wrote:

 That's exactly what padding is designed to do: Let's say you know there is a 
 run of bytes with a known length (from a header field maybe) and you want to 
 parse an unbounded number of objects within this area. You could use
 
 (padding (repeated inner-codec) 1024)

Excellent.

 Currently codecs don't know about their context, that means, I can't behave 
 differently depending on whether a codec is used within a padding or not, 
 sorry.

It could work the other way around, with 'padding catching certain types of 
exceptions thrown by its inner codecs.

For example, when parsing something like (padding (repeated (constant 0x99)) 
len pad-byte), padding could catch the exception thrown by the constant codec 
and then use pad-byte to parse the remaining bytes.

But I can live without this, if it's too niche or too hard to implement.

 Then the docs seem to be wrong (or at least confusing), since the example 
 code for 'bits says the first item corresponds to the highest bit.
 Thanks, I fixed the documentation. 

A couple other things about the README:

The docs for 'header say that body-header should produce a codec that will be 
used to encode the header, but in testing I've had to make it return the header 
directly (which does make more sense).

Also, the expression #{:a :b:last} in the 'bits section is missing a space.

Thanks for all the help, by the way!



ANN: Another binary parser combinator - this time for java's streams

2014-01-29 Thread Steffen
Hello Clojure community,

There are already two excellent libraries for reading/writing/manipulating
binary data: Zach's Gloss and Clojurewerkz' Buffy for Java's ByteBuffers.
I would like to offer another library for Java's Input/OutputStreams. It is
inspired by Gloss but not compatible in syntax, I'm sorry.
The focus is on

   - read/write performance
   - no external dependencies
   - works with java.io.*Stream

If you use Leiningen please add the following to your dependencies:

[org.clojars.smee/binary "0.2.4"]

The link to the source code and README is https://github.com/smee/binary.
Demo code to parse Bitcoin blocks (including scripts):
https://github.com/smee/binary/blob/master/src/org/clojars/smee/binary/demo/bitcoin.clj
Demo code for MP3 ID3v2 tags (work in progress):
https://github.com/smee/binary/blob/master/src/org/clojars/smee/binary/demo/mp3.clj

Apart from the README and doc strings, there is no further documentation yet.
Please refer to the demos and the unit tests for now.

Thanks,

Steffen Dienst



Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-29 Thread Michael Gardner
Looks good! A few questions:

1) Is it possible to specify a byte length for a 'repeated codec, rather than a 
number of objects?

2) Would you consider an enum type, for convenience? Something like:

(defn enum [type m]
  (compile-codec type m
                 (clojure.set/map-invert m)))

3) In the mp3.clj demo, the flags seem to be listed in the wrong order. Or does 
the 'bits function actually take its arguments LSB-first?



Re: ANN: Another binary parser combinator - this time for java's streams

2014-01-29 Thread Steffen
Please see below.

On Wednesday, 29 January 2014 17:49:56 UTC+1, Michael Gardner wrote:

 Looks good! A few questions: 

Thanks.
 

 1) Is it possible to specify a byte length for a 'repeated codec, rather 
 than a number of objects?

If your 'object' is a byte, sure:

(repeated :byte :length 1234)

If you would like to use a specific codec other than :byte or :ubyte but 
also restrict the number of bytes read this would only work if you expected 
to have some kind of optional padding after your objects, like: 

(padding inner-codec 4096).


 2) Would you consider an enum type, for convenience? Something like:  


 (defn enum [type m]
   (compile-codec type m
                  (clojure.set/map-invert m)))

So m would be a map of for example keywords to a native datatype like int 
that would allow you to represent a fixed number of things with distinct 
binary representations? Looks good to me. What do you think should be the 
behaviour in case of an unspecified value (not in m)?
 

 3) In the mp3.clj demo, the flags seem to be listed in the wrong order. Or 
 does the 'bits function actually take its arguments LSB-first? 

Currently the index in the vector is the index of the bit. Yes, that means 
LSB-first.
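
To illustrate what "the index in the vector is the index of the bit" means, here is a small sketch in plain Clojure. It does not use the library's bits codec: decode-flags and encode-flags are hypothetical helpers, and the flag names are placeholders.

```clojure
(defn decode-flags
  "Returns the set of flag names whose bit is set in b. The name at vector
  index i belongs to bit i, so index 0 is the least significant bit."
  [names b]
  (set (keep-indexed (fn [i flag]
                       (when (and flag (bit-test b i)) flag))
                     names)))

(defn encode-flags
  "Inverse of decode-flags: sets bit i for every name at index i that is
  contained in the set flags."
  [names flags]
  (reduce-kv (fn [b i flag]
               (if (contains? flags flag) (bit-set b i) b))
             0
             names))

(decode-flags [:a :b :c] 2r101)    ;=> #{:a :c}
(encode-flags [:a :b :c] #{:a :c}) ;=> 5
```

With this ordering a format specification that lists its flags from the highest bit downwards has to be written into the vector in reverse.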

