I'll enter a ticket and see where it goes.

Thanks!
-Bob

On Wed, Oct 16, 2013 at 1:27 AM, Geert Josten <[email protected]> wrote:
> Hi Bob,
>
> Support can tell you best if it is known or not. I’m guessing not, first
> time I saw it mentioned on this list at least. Not sure whether the spec
> requires support for these encoded-words, and what the behavior should be
> in case not, but ML experts will know.
>
> Kind regards,
> Geert
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Bob H.
> *Sent:* Tuesday, 15 October 2013 21:38
> *To:* general
> *Subject:* Re: [MarkLogic Dev General] XDMP-ENCODING error using
> xdmp:get-request-header()
>
> My apologies for replying to my own thread ... but I've done some further
> digging on this. I think I've convinced myself that there's a bug in the
> handling of this sort of character encoding in HTTP header values.
>
> Excerpts from RFC 2047, which governs this sort of encoding:
>
>     Instead, certain sequences of "ordinary" printable ASCII characters
>     (known as "encoded-words") are reserved for use as encoded data. The
>     syntax of encoded-words is such that they are unlikely to
>     "accidentally" appear as normal text in message headers.
>     Furthermore, the characters used in encoded-words are restricted to
>     those which do not have special meanings in the context in which the
>     encoded-word appears.
>
>     Generally, an "encoded-word" is a sequence of printable ASCII
>     characters that begins with "=?", ends with "?=", and has two "?"s in
>     between. It specifies a character set and an encoding method, and
>     also includes the original text encoded as graphic ASCII characters,
>     according to the rules for that encoding method.
>
> The RFC goes on to specify the ABNF grammar that defines an "encoded-word"
> ... and none of my test samples match the grammar. As such, it seems to me
> that ML should not be attempting to interpret these values as encoded
> characters.
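
For anyone who wants to check header values against the encoded-word grammar mentioned above, here is a minimal sketch in Python (not XQuery, and not a description of how MarkLogic parses headers). It builds a regular expression from the RFC 2047 definitions of `token` and `encoded-text` and tests the two samples from this thread. The function name `contains_encoded_word` and the sample value `=?UTF-8?Q?caf=C3=A9?=` are illustrative assumptions, not something from the original posts. The sketch ignores the RFC's 75-character limit and its rules about where an encoded-word may appear within a header.

```python
import re

# RFC 2047 "especials": characters not allowed in a charset or encoding token.
ESPECIALS = set('()<>@,;:\\"/[]?.=')

# token = 1*<any printable ASCII char except SPACE, CTLs, and especials>
_token_chars = "".join(
    re.escape(chr(c)) for c in range(0x21, 0x7F) if chr(c) not in ESPECIALS
)
TOKEN = f"[{_token_chars}]+"

# encoded-text = 1*<any printable ASCII char other than "?" or SPACE>
ENCODED_TEXT = r"[\x21-\x3e\x40-\x7e]+"

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(rf"=\?{TOKEN}\?{TOKEN}\?{ENCODED_TEXT}\?=")


def contains_encoded_word(value: str) -> bool:
    """Return True if value contains a substring matching the grammar above.

    Hypothetical helper for illustration; length and placement rules
    from RFC 2047 are deliberately not enforced.
    """
    return ENCODED_WORD.search(value) is not None


if __name__ == "__main__":
    samples = [
        "Test=?Value",            # sample from the thread
        "Test=?Value?",           # sample from the thread
        "=?UTF-8?Q?caf=C3=A9?=",  # assumed example of a well-formed encoded-word
    ]
    for sample in samples:
        print(f"{sample!r}: {contains_encoded_word(sample)}")
```

Under this reading of the grammar, neither `Test=?Value` nor `Test=?Value?` contains an encoded-word, since both lack the second and third `?` delimiters and the closing `?=`.
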
>
> The full RFC can be found here:
> http://tools.ietf.org/html/rfc2047
>
> Further, if I use a properly encoded value, the entire encoded section of
> the header value is simply omitted when I read the value using
> xdmp:get-request-header().
>
> I plan to enter a support ticket for this, unless it's already a known bug?
>
> -Bob
>
> On Tue, Oct 15, 2013 at 2:32 PM, Bob H. <[email protected]> wrote:
>
> I'm seeing what I consider to be odd behavior when getting HTTP request
> header values via xdmp:get-request-header(). If the header value contains
> the string "=?", those characters are missing when I get the header value.
> If the header value contains an additional "?" character after the initial
> "=?", then an XDMP-ENCODING error is thrown. I think I understand why this
> is happening, but not how to avoid it.
>
> I wrote a module that simply returns the HTTP header values that were
> provided on the request, and called the module via an HTTP app server.
>
> Here's the code:
>
>     for $name in xdmp:get-request-header-names()
>     return
>       fn:concat($name, ': ', xdmp:get-request-header($name))
>
> When I call the module with a header value containing "=?", it echoes the
> value back with the "=?" stripped out. For example:
>
>     TestHeader1: Test=?Value
>     => TestHeader1: TestValue
>
> When I call the module with a header value containing "=?" and another "?"
> anywhere in the remainder of the header value, I get nothing back and see
> an XDMP-ENCODING error in the ErrorLog:
>
>     TestHeader1: Test=?Value?
>     => (no response)
>
>     Error: AppRequestTask::run: XDMP-ENCODING: (err:XQST0087) Unsupported
>     character encoding: Value
>
> I'm guessing this is because technically one is allowed to specify the
> character encoding of an HTTP header value via MIME encoding, as described
> here:
>
> http://stackoverflow.com/questions/4400678/http-header-should-use-what-character-encoding
>
> All of that brings me to my question ... in my case, the header value is
> not encoded in any special way, but I do want to be able to use the
> characters "=" and "?" in header values. Is there a way to disable this
> character encoding processing (as I have no need to use alternate encodings
> in my header values at this time) ... or some other way to work around this
> behavior?
>
> Thanks!
>
> -Bob
>
> --
> Bob Hodgeman
> Consulting Engineer
> LexisNexis

_______________________________________________ General mailing list [email protected] http://developer.marklogic.com/mailman/listinfo/general
