I'll enter a ticket and see where it goes.  Thanks!

-Bob


On Wed, Oct 16, 2013 at 1:27 AM, Geert Josten <[email protected]> wrote:

> Hi Bob,
>
>
>
> Support can tell you best whether this is a known issue. I'm guessing not;
> this is the first time I have seen it mentioned on this list, at least. I'm
> not sure whether the spec requires support for these encoded-words, or what
> the behavior should be if it doesn't, but the ML experts will know.
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From:* [email protected] [mailto:
> [email protected]] *On behalf of *Bob H.
> *Sent:* Tuesday, October 15, 2013 21:38
> *To:* general
> *Subject:* Re: [MarkLogic Dev General] XDMP-ENCODING error using
> xdmp:get-request-header()
>
>
>
> My apologies for replying to my own thread ... but I've done some further
> digging on this.  I think I've convinced myself that there's a bug in the
> handling of this sort of character encoding in HTTP header values.
>
> Excerpts from RFC 2047, which governs this sort of encoding:
>
>    Instead, certain sequences of "ordinary" printable ASCII characters
>    (known as "encoded-words") are reserved for use as encoded data.  The
>    syntax of encoded-words is such that they are unlikely to
>    "accidentally" appear as normal text in message headers.
>    Furthermore, the characters used in encoded-words are restricted to
>    those which do not have special meanings in the context in which the
>    encoded-word appears.
>
>    Generally, an "encoded-word" is a sequence of printable ASCII
>    characters that begins with "=?", ends with "?=", and has two "?"s in
>    between.  It specifies a character set and an encoding method, and
>    also includes the original text encoded as graphic ASCII characters,
>    according to the rules for that encoding method.
>
>
>
> The RFC goes on to specify the ABNF grammar that defines an "encoded-word"
> ... and none of my test samples match the grammar.  As such, it seems to me
> that ML should not be attempting to interpret these values as encoded
> characters.
>
> The full RFC can be found here:
> http://tools.ietf.org/html/rfc2047
>
> Further, if I use a properly encoded value, the entire encoded section of
> the header value is simply omitted when I read the value using
> xdmp:get-request-header().
>
> I plan to enter a support ticket for this, unless it's already a known bug?
>
> -Bob
>
>
>
> On Tue, Oct 15, 2013 at 2:32 PM, Bob H. <[email protected]> wrote:
>
> I'm seeing what I consider to be odd behavior when getting HTTP request
> header values via xdmp:get-request-header().  If the header value contains
> the string "=?", those characters are missing when I get the header value.
> If the header value contains an additional "?" character after the initial
> "=?", then an XDMP-ENCODING error is thrown.  I think I understand why this
> is happening, but not how to avoid it.
>
> I wrote a module that simply returns the HTTP header values that were
> provided on the request, and called the module via an HTTP app server.
>
> Here's the code:
> for $name in xdmp:get-request-header-names()
> return
>     fn:concat($name, ': ', xdmp:get-request-header($name) )
>
> When I call the module with a header value containing "=?", it echoes the
> value back with the "=?" stripped out.  For example:
>
> TestHeader1: Test=?Value
> => TestHeader1: TestValue
>
> When I call the module with a header value containing "=?" and another "?"
> anywhere in the remainder of the header value, I get nothing back and see
> an XDMP-ENCODING error in the ErrorLog:
>
> TestHeader1: Test=?Value?
> => (no response)
>
>
> Error: AppRequestTask::run: XDMP-ENCODING: (err:XQST0087) Unsupported
> character encoding: Value
>
> I'm guessing this is because technically one is allowed to specify the
> character encoding of an HTTP header value via MIME encoding, as described
> here:
>
> http://stackoverflow.com/questions/4400678/http-header-should-use-what-character-encoding
>
> All of that brings me to my question ... in my case, the header value is
> not encoded in any special way, but I do want to be able to use the
> characters "=" and "?" in header values.  Is there a way to disable this
> character encoding processing (as I have no need to use alternate encodings
> in my header values at this time) ... or some other way to work around this
> behavior?
>
> Thanks!
>
> -Bob
>
> --
>
> Bob Hodgeman
> Consulting Engineer
>
> LexisNexis
>
>
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>
>
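The claim in the quoted thread that the test values do not match the RFC 2047 "encoded-word" grammar can be checked outside MarkLogic. The Python sketch below is only an illustration, not MarkLogic's implementation. The function names is_encoded_word and find_encoded_words are hypothetical, the character classes are one reading of the grammar in RFC 2047 section 2, and splitting the header value on whitespace is a simplification of the RFC's placement rules.

```python
import re

# token = 1*<any CHAR except SPACE, CTLs, and especials>
# especials = ( ) < > @ , ; : \ " / [ ] ? . =
_TOKEN = r"[!#$%&'*+\-0-9A-Z^_`a-z{|}~]+"

# encoded-text = 1*<any printable ASCII character other than "?" or SPACE>
_ENCODED_TEXT = r"[\x21-\x3e\x40-\x7e]+"

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
_ENCODED_WORD = re.compile(
    r"=\?" + _TOKEN + r"\?" + _TOKEN + r"\?" + _ENCODED_TEXT + r"\?="
)

# RFC 2047 limits an encoded-word to 75 characters, delimiters included.
_MAX_LENGTH = 75


def is_encoded_word(word: str) -> bool:
    """Return True if the whole string matches the encoded-word grammar."""
    if len(word) > _MAX_LENGTH:
        return False
    return _ENCODED_WORD.fullmatch(word) is not None


def find_encoded_words(header_value: str) -> list[str]:
    """Return the whitespace-delimited words that are encoded-words.

    Assumption: encoded-words are separated from surrounding text by
    whitespace, so each whitespace-delimited word is tested on its own.
    """
    return [w for w in header_value.split() if is_encoded_word(w)]
```

Under these assumptions, find_encoded_words("Test=?Value") and find_encoded_words("Test=?Value?") both return an empty list, which is consistent with the observation that neither test value is an encoded-word, while a value such as "=?UTF-8?Q?caf=C3=A9?=" is recognized. The grammar only requires the encoding field to be a token; the sketch does not check that it is "Q" or "B".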