processes.texi

Richard M. Stallman Fri, 17 Jun 2005 07:20:20 -0700

Index: emacs/lispref/processes.texi
diff -c emacs/lispref/processes.texi:1.58 emacs/lispref/processes.texi:1.59
*** emacs/lispref/processes.texi:1.58   Sun May 15 20:42:11 2005
--- emacs/lispref/processes.texi        Fri Jun 17 13:51:19 2005
***************
*** 52,57 ****
--- 52,58 ----
  * Datagrams::                UDP network connections.
  * Low-Level Network::        Lower-level but more general function
                                 to create connections and servers.
+ * Byte Packing::             Using bindat to pack and unpack binary data.
  @end menu
  
  @node Subprocess Creation
***************
*** 2015,2020 ****
--- 2016,2422 ----
  @code{make-network-process} and @code{set-network-process-option}.
  @end table
  
+ @node Byte Packing
+ @section Packing and Unpacking Byte Arrays
+ 
+   This section describes how to pack and unpack arrays of bytes,
+ usually for binary network protocols.  These functions convert byte
+ arrays to alists, and vice versa.  The byte array can be represented as a
+ unibyte string or as a vector of integers, while the alist associates
+ symbols either with fixed-size objects or with recursive sub-alists.
+ 
+ @cindex serializing
+ @cindex deserializing
+ @cindex packing
+ @cindex unpacking
+   Conversion from byte arrays to nested alists is also known as
+ @dfn{deserializing} or @dfn{unpacking}, while going in the opposite
+ direction is also known as @dfn{serializing} or @dfn{packing}.
+ 
+ @menu
+ * Bindat Spec::         Describing data layout.
+ * Bindat Functions::    Doing the unpacking and packing.
+ * Bindat Examples::     Samples of what bindat.el can do for you!
+ @end menu
+ 
+ @node Bindat Spec
+ @subsection Describing Data Layout
+ 
+   To control unpacking and packing, you write a @dfn{data layout
+ specification}, a special nested list describing named and typed
+ @dfn{fields}.  This specification controls the length of each field to
+ be processed, and how to pack or unpack it.
+ 
+ @cindex endianness
+ @cindex big endian
+ @cindex little endian
+ @cindex network byte ordering
+   A field's @dfn{type} describes the size (in bytes) of the object
+ that the field represents and, in the case of multibyte fields, how
+ the bytes are ordered within the field.  The two possible orderings
+ are ``big endian'' (also known as ``network byte ordering'') and
+ ``little endian''.  For instance, the number @code{#x23cd} (decimal
+ 9165) in big endian would be the two bytes @code{#x23} @code{#xcd};
+ and in little endian, @code{#xcd} @code{#x23}.  Here are the possible
+ type values:
+ 
+ @table @code
+ @item u8
+ @itemx byte
+ Unsigned byte, with length 1.
+ 
+ @item u16
+ @itemx word
+ @itemx short
+ Unsigned integer in network byte order, with length 2.
+ 
+ @item u24
+ Unsigned integer in network byte order, with length 3.
+ 
+ @item u32
+ @itemx dword
+ @itemx long
+ Unsigned integer in network byte order, with length 4.
+ Note: The range of these values may be restricted by the limits of Emacs's integer implementation.
+ 
+ @item u16r
+ @itemx u24r
+ @itemx u32r
+ Unsigned integer in little endian order, with length 2, 3 and 4, respectively.
+ 
+ @item str @var{len}
+ String of length @var{len}.
+ 
+ @item strz @var{len}
+ Zero-terminated string of length @var{len}.
+ 
+ @item vec @var{len}
+ Vector of @var{len} bytes.
+ 
+ @item ip
+ Four-byte vector representing an Internet address.  For example:
+ @code{[127 0 0 1]} for localhost.
+ 
+ @item bits @var{len}
+ List of set bits in @var{len} bytes.  The bytes are taken in big
+ endian order and the bits are numbered starting with @code{8 *
+ @var{len} @minus{} 1} and ending with zero.  For example: @code{bits
+ 2} unpacks @code{#x28} @code{#x1c} to @code{(2 3 4 11 13)} and
+ @code{#x1c} @code{#x28} to @code{(3 5 10 11 12)}.
+ 
+ @item (eval @var{form})
+ @var{form} is a Lisp expression evaluated at the moment the field is
+ unpacked or packed.  The result of the evaluation should be one of the
+ above-listed type specifications.
+ @end table
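+ 
+   As a small illustration of the byte orderings above, the following
+ sketch unpacks the same two bytes once as @code{u16} and once as
+ @code{u16r}.  The field names @code{big} and @code{little} are
+ invented for this example, and the field specification syntax it
+ relies on is explained below.
+ 
+ @lisp
+ (require 'bindat)
+ 
+ ;; Read bytes #x23 #xcd as big endian, then as little endian.
+ (let ((fields (bindat-unpack '((big u16) (little u16r))
+                              [#x23 #xcd #x23 #xcd])))
+   (list (bindat-get-field fields 'big)
+         (bindat-get-field fields 'little)))
+      @result{} (9165 52515)
+ @end lisp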
+ 
+ A field specification generally has the form @code{([@var{name}]
+ @var{handler})}.  The square brackets indicate that @var{name} is
+ optional.  (Don't use names that are symbols meaningful as type
+ specifications (above) or handler specifications (below), since that
+ would be ambiguous.)  @var{name} can be a symbol or the expression
+ @code{(eval @var{form})}, in which case @var{form} should evaluate to
+ a symbol.
+ 
+ @var{handler} describes how to unpack or pack the field and can be one
+ of the following:
+ 
+ @table @code
+ @item @var{type}
+ Unpack/pack this field according to the type specification @var{type}.
+ 
+ @item eval @var{form}
+ Evaluate @var{form}, a Lisp expression, for side-effect only.  If the
+ field name is specified, the value is bound to that field name.
+ @var{form} can access and update these dynamically bound variables:
+ 
+ @table @code
+ @item raw-data
+ The data as a byte array.
+ 
+ @item pos
+ Current position of the unpacking or packing operation.
+ 
+ @item struct
+ The alist of fields unpacked so far, or the entire alist being packed.
+ 
+ @item last
+ Value of the last field processed.
+ @end table
+ 
+ @item fill @var{len}
+ Skip @var{len} bytes.  In packing, this leaves them unchanged,
+ which normally means they remain zero.  In unpacking, this means
+ they are ignored.
+ 
+ @item align @var{len}
+ Skip to the next multiple of @var{len} bytes.
+ 
+ @item struct @var{spec-name}
+ Process @var{spec-name} as a sub-specification.  This describes a
+ structure nested within another structure.
+ 
+ @item union @var{form} (@var{tag} @var{spec})@dots{}
+ @c ??? I don't see how one would actually  use this.
+ @c ??? what kind of expression would be useful for @var{form}?
+ Evaluate @var{form}, a Lisp expression, find the first @var{tag}
+ that matches it, and process its associated data layout specification
+ @var{spec}.  Matching can occur in one of three ways:
+ 
+ @itemize
+ @item
+ If a @var{tag} has the form @code{(eval @var{expr})}, evaluate
+ @var{expr} with the variable @code{tag} dynamically bound to the value
+ of @var{form}.  A non-@code{nil} result indicates a match.
+ 
+ @item
+ @var{tag} matches if it is @code{equal} to the value of @var{form}.
+ 
+ @item
+ @var{tag} matches unconditionally if it is @code{t}.
+ @end itemize
+ 
+ @item repeat @var{count} @var{field-spec}@dots{}
+ Process the @var{field-spec}s in order, @var{count} times over.
+ @var{count} may be an integer, or a list of one element naming a
+ previous field.  For correct operation, each @var{field-spec} must
+ include a name.
+ @c ??? What does it MEAN?
+ @end table
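+ 
+   The following sketch suggests how the @code{fill} and @code{align}
+ handlers skip over padding while unpacking.  The field names
+ @code{kind}, @code{size} and @code{tail} and the sample bytes are
+ made up for illustration only.
+ 
+ @lisp
+ (require 'bindat)
+ 
+ ;; Offset 0 holds `kind'; `fill' skips offsets 1 to 3; offset 4
+ ;; holds `size'; `align' then advances to offset 8 for `tail'.
+ (let ((fields (bindat-unpack '((kind  u8)
+                                (fill  3)
+                                (size  u8)
+                                (align 4)
+                                (tail  u8))
+                              [7 0 0 0 5 0 0 0 9])))
+   (list (bindat-get-field fields 'kind)
+         (bindat-get-field fields 'size)
+         (bindat-get-field fields 'tail)))
+      @result{} (7 5 9)
+ @end lisp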
+ 
+ @node Bindat Functions
+ @subsection Functions to Unpack and Pack Bytes
+ 
+   In the following documentation, @var{spec} refers to a data layout
+ specification, @var{raw-data} to a byte array, and @var{struct} to an
+ alist representing unpacked field data.
+ 
+ @defun bindat-unpack spec raw-data &optional pos
+ This function unpacks data from the byte array @var{raw-data}
+ according to @var{spec}.  Normally this starts unpacking at the
+ beginning of the byte array, but if @var{pos} is non-@code{nil}, it
+ specifies a zero-based starting position to use instead.
+ 
+ The value is an alist or nested alist in which each element describes
+ one unpacked field.
+ @end defun
+ 
+ @defun bindat-get-field struct &rest name
+ This function selects a field's data from the nested alist
+ @var{struct}.  Usually @var{struct} was returned by
+ @code{bindat-unpack}.  If @var{name} corresponds to just one argument,
+ that means to extract a top-level field value.  Multiple @var{name}
+ arguments specify repeated lookup of sub-structures.  An integer name
+ acts as an array index.
+ 
+ For example, if @var{name} is @code{(a b 2 c)}, that means to find
+ field @code{c} in the third element of subfield @code{b} of field
+ @code{a}.  (This corresponds to @code{struct.a.b[2].c} in C.)
+ @end defun
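+ 
+   As a quick sketch of nested lookup, the call below walks a
+ hand-written alist rather than one returned by
+ @code{bindat-unpack}.  The field names and values are hypothetical.
+ Because integer names are zero-based indices, @code{2} selects the
+ third element.
+ 
+ @lisp
+ (require 'bindat)
+ 
+ ;; Field `a' contains subfield `b', a list of three sub-alists.
+ (bindat-get-field '((a (b ((c . 10)) ((c . 20)) ((c . 30)))))
+                   'a 'b 2 'c)
+      @result{} 30
+ @end lisp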
+ 
+ @defun bindat-length spec struct
+ @c ??? I don't understand this at all -- rms
+ This function returns the total length in bytes that @var{struct}
+ occupies when packed according to @var{spec}.
+ @end defun
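+ 
+   One way to read @code{bindat-length}: a specification alone does
+ not always fix the size, since a length may name an earlier field,
+ so the function also needs the data.  The specifications and field
+ names in this sketch are invented for illustration.
+ 
+ @lisp
+ (require 'bindat)
+ 
+ ;; Fixed layout: 2 + 1 + 4 bytes.
+ (bindat-length '((id u16) (flags u8) (name str 4))
+                '((id . 1) (flags . 0) (name . "abcd")))
+      @result{} 7
+ 
+ ;; Variable layout: the size of `payload' comes from field `n'.
+ (bindat-length '((n u8) (payload vec (n)))
+                '((n . 3) (payload . [1 2 3])))
+      @result{} 4
+ @end lisp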
+ 
+ @defun bindat-pack spec struct &optional raw-data pos
+ This function returns a byte array packed according to @var{spec} from
+ the data in the alist @var{struct}.  Normally it creates and fills a
+ new byte array starting at the beginning.  However, if @var{raw-data}
+ is non-@code{nil}, it specifies a pre-allocated string or vector to
+ pack into.  If @var{pos} is non-@code{nil}, it specifies the starting
+ offset for packing into @var{raw-data}.
+ 
+ @c ??? Isn't this a bug?  Shouldn't it always be unibyte?
+ Note: The result is a multibyte string; use @code{string-make-unibyte}
+ on it to make it unibyte if necessary.
+ @end defun
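+ 
+   A minimal packing sketch, with a made-up two-field specification:
+ the result is converted to a list with @code{append} so that the
+ individual byte values are visible.
+ 
+ @lisp
+ (require 'bindat)
+ 
+ ;; 258 is #x0102, so the big endian u16 packs as bytes 1 and 2.
+ (append (bindat-pack '((id u16) (flags u8))
+                      '((id . 258) (flags . 3)))
+         nil)
+      @result{} (1 2 3)
+ @end lisp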
+ 
+ @defun bindat-ip-to-string ip
+ Convert the Internet address vector @var{ip} to a string in the usual
+ dotted notation.
+ 
+ @example
+ (bindat-ip-to-string [127 0 0 1])
+      @result{} "127.0.0.1"
+ @end example
+ @end defun
+ 
+ @node Bindat Examples
+ @subsection Examples of Byte Unpacking and Packing
+ 
+   Here is a complete example of byte unpacking and packing:
+ 
+ @lisp
+ (defvar fcookie-index-spec
+   '((:version  u32)
+     (:count    u32)
+     (:longest  u32)
+     (:shortest u32)
+     (:flags    u32)
+     (:delim    u8)
+     (:ignored  fill 3)
+     (:offset   repeat (:count)
+                (:foo u32)))
+   "Description of a fortune cookie index file's contents.")
+ 
+ (defun fcookie (cookies &optional index)
+   "Display a random fortune cookie from file COOKIES.
+ Optional second arg INDEX specifies the associated index
+ filename, which is by default constructed by appending
+ \".dat\" to COOKIES.  Display cookie text in possibly
+ new buffer \"*Fortune Cookie: BASENAME*\" where BASENAME
+ is COOKIES without the directory part."
+   (interactive "fCookies file: ")
+   (let* ((info (with-temp-buffer
+                  (insert-file-contents-literally
+                   (or index (concat cookies ".dat")))
+                  (bindat-unpack fcookie-index-spec
+                                 (buffer-string))))
+          (sel (random (bindat-get-field info :count)))
+          (beg (cdar (bindat-get-field info :offset sel)))
+          (end (or (cdar (bindat-get-field info :offset (1+ sel)))
+                   (nth 7 (file-attributes cookies)))))
+     (switch-to-buffer (get-buffer-create
+                        (format "*Fortune Cookie: %s*"
+                                (file-name-nondirectory cookies))))
+     (erase-buffer)
+     (insert-file-contents-literally cookies nil beg (- end 3))))
+ 
+ (defun fcookie-create-index (cookies &optional index delim)
+   "Scan file COOKIES, and write out its index file.
+ Optional second arg INDEX specifies the index filename,
+ which is by default constructed by appending \".dat\" to
+ COOKIES.  Optional third arg DELIM specifies the unibyte
+ character which, when found on a line of its own in
+ COOKIES, indicates the border between entries."
+   (interactive "fCookies file: ")
+   (setq delim (or delim ?%))
+   (let ((delim-line (format "\n%c\n" delim))
+         (count 0)
+         (max 0)
+         min p q len offsets)
+     (unless (= 3 (string-bytes delim-line))
+       (error "Delimiter cannot be represented in one byte"))
+     (with-temp-buffer
+       (insert-file-contents-literally cookies)
+       (while (and (setq p (point))
+                   (search-forward delim-line (point-max) t)
+                   (setq len (- (point) 3 p)))
+         (setq count (1+ count)
+               max (max max len)
+               min (min (or min max) len)
+               offsets (cons (1- p) offsets))))
+     (with-temp-buffer
+       (set-buffer-multibyte nil)
+       (insert (string-make-unibyte
+                (bindat-pack
+                 fcookie-index-spec
+                 `((:version . 2)
+                   (:count . ,count)
+                   (:longest . ,max)
+                   (:shortest . ,min)
+                   (:flags . 0)
+                   (:delim . ,delim)
+                   (:offset . ,(mapcar (lambda (o)
+                                         (list (cons :foo o)))
+                                       (nreverse offsets)))))))
+       (let ((coding-system-for-write 'raw-text-unix))
+         (write-file (or index (concat cookies ".dat")))))))
+ @end lisp
+ 
+ Following is an example of defining and unpacking a complex structure.
+ Consider the following C structures:
+ 
+ @example
+ struct header @{
+     unsigned long    dest_ip;
+     unsigned long    src_ip;
+     unsigned short   dest_port;
+     unsigned short   src_port;
+ @};
+ 
+ struct data @{
+     unsigned char    type;
+     unsigned char    opcode;
+     unsigned short   length;  /* In little endian order */
+     unsigned char    id[8];   /* nul-terminated string  */
+     unsigned char    data[/* (length + 3) & ~3 */];
+ @};
+ 
+ struct packet @{
+     struct header    header;
+     unsigned char    items;
+     unsigned char    filler[3];
+     struct data      item[/* items */];
+ 
+ @};
+ @end example
+ 
+ The corresponding data layout specification:
+ 
+ @lisp
+ (setq header-spec
+       '((dest-ip   ip)
+         (src-ip    ip)
+         (dest-port u16)
+         (src-port  u16)))
+ 
+ (setq data-spec
+       '((type      u8)
+         (opcode    u8)
+         (length    u16r) ;; little endian order
+         (id        strz 8)
+         (data      vec (length))
+         (align     4)))
+ 
+ (setq packet-spec
+       '((header    struct header-spec)
+         (items     u8)
+         (fill      3)
+         (item      repeat (items)
+                    (struct data-spec))))
+ @end lisp
+ 
+ A binary data representation:
+ 
+ @lisp
+ (setq binary-data
+       [ 192 168 1 100 192 168 1 101 01 28 21 32 2 0 0 0
+         2 3 5 0 ?A ?B ?C ?D ?E ?F 0 0 1 2 3 4 5 0 0 0
+         1 4 7 0 ?B ?C ?D ?E ?F ?G 0 0 6 7 8 9 10 11 12 0 ])
+ @end lisp
+ 
+ The corresponding decoded structure:
+ 
+ @lisp
+ (setq decoded-structure (bindat-unpack packet-spec binary-data))
+      @result{}
+ ((header
+   (dest-ip   . [192 168 1 100])
+   (src-ip    . [192 168 1 101])
+   (dest-port . 284)
+   (src-port  . 5408))
+  (items . 2)
+  (item ((data . [1 2 3 4 5])
+         (id . "ABCDEF")
+         (length . 5)
+         (opcode . 3)
+         (type . 2))
+        ((data . [6 7 8 9 10 11 12])
+         (id . "BCDEFG")
+         (length . 7)
+         (opcode . 4)
+         (type . 1))))
+ @end lisp
+ 
+ Fetching data from this structure:
+ 
+ @lisp
+ (bindat-get-field decoded-structure 'item 1 'id)
+      @result{} "BCDEFG"
+ @end lisp
+ 
  @ignore
     arch-tag: ba9da253-e65f-4e7f-b727-08fba0a1df7a
  @end ignore



_______________________________________________
Emacs-diffs mailing list
Emacs-diffs@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-diffs