On Wed, Jan 23, 2013 at 5:09 PM, Alex Shinn alexsh...@gmail.com wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Ivan Raikov ivan.g.rai...@gmail.com wrote:
Yes, I ran into this when I was adding UTF-8 support to mbox... If you
were to add wide char support in srfi-14, is there a way to quantify the
On Wed, Jan 23, 2013 at 03:29:01PM +0900, Ivan Raikov wrote:
Hi Peter,
I think uri-generic does not silently mangle input upon receiving UTF-8,
it just returns #f.
When parsing, yes. I think this should stay the way it is (see below).
What I was referring to here was the example in my
Hi Peter,
I think uri-generic does not silently mangle input upon receiving UTF-8,
it just returns #f. I think it is not a bad idea to raise an exception
instead.
I have not yet had the chance to thoroughly test the UTF-8 mapping
constructor, but will try to do this during the weekend.
On Thu, Jan 17, 2013 at 4:51 AM, Peter Bex peter@xs4all.nl wrote:
On Tue, Jan 15, 2013 at 02:44:08PM +0900, Alex Shinn wrote:
This result looks broken. As I noted in my previous mail, the URI
representation already handles non-ASCII characters and escapes on
output:
$ csi -R
On Thu, Jan 17, 2013 at 09:35:36AM +0900, Ivan Raikov wrote:
Hi Peter,
I think that allowing raw UTF-8 sequences in uri-generic breaks
compatibility with RFC 3986. In other words, if you construct a URI with a
UTF-8 sequence that happens to include reserved ASCII characters, those
ASCII
On Wed, Jan 16, 2013 at 11:22:57AM +0900, Alex Shinn wrote:
Anyway, this isn't really important. I'm mostly concerned
with making utf8 do the right thing, and was wondering what
the API was because it's not clear from the docs.
OK, I think it's worth figuring this out.
Put another way, do
On Tue, Jan 15, 2013 at 02:44:08PM +0900, Alex Shinn wrote:
This result looks broken. As I noted in my previous mail, the URI
representation already handles non-ASCII characters and escapes on output:
$ csi -R uri-common
#;1 (make-uri scheme: 'http host: "127.0.0.1" path: '(/ "삼계탕"))
Hi Peter,
I think that allowing raw UTF-8 sequences in uri-generic breaks
compatibility with RFC 3986. In other words, if you construct a URI with a
UTF-8 sequence that happens to include reserved ASCII characters, those
ASCII characters will not get escaped, and you could potentially be
On Tue, Jan 15, 2013 at 3:03 PM, Ivan Raikov ivan.g.rai...@gmail.com wrote:
Percent-encoded sequences of more than one octet will not get touched by
pct-decode in the current implementation, so you will not get double
escaping. Percent-encoded sequences of one octet will get decoded if they
On Tue, Jan 15, 2013 at 06:07:06PM +0900, Alex Shinn wrote:
On Tue, Jan 15, 2013 at 3:03 PM, Ivan Raikov ivan.g.rai...@gmail.com wrote:
Percent-encoded sequences of more than one octet will not get touched by
pct-decode in the current implementation, so you will not get double
escaping.
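For readers unfamiliar with the term, double escaping is what happens when an already percent-encoded string is encoded a second time, so that each "%" itself becomes "%25". The following is a minimal Python sketch of the effect using the standard urllib.parse module; it illustrates the general problem only and says nothing about how pct-decode in uri-generic is implemented.

```python
from urllib.parse import quote, unquote

# Encode one Korean syllable once: its three UTF-8 octets are percent-encoded.
once = quote("삼")

# Encoding the result again escapes the "%" signs themselves (double escaping).
twice = quote(once)

print(once)   # %EC%82%BC
print(twice)  # %25EC%2582%25BC

# A single decode of the doubly escaped form only recovers the singly
# escaped form, not the original character.
print(unquote(twice) == once)  # True
```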
On Tue, Jan 15, 2013 at 07:30:07PM +0900, Alex Shinn wrote:
Right, I'm familiar with the evil standards :) I'm also hoping that we can
have some basic compatibility between Chicken's uri module and Chibi's
(and whatever R7RS WG2 comes up with).
That would be nice indeed.
It seems to me the
On Tue, Jan 15, 2013 at 7:48 PM, Peter Bex peter@xs4all.nl wrote:
These special characters are called reserved in the BNF. As you can
see, the question mark, equals sign and ampersand are in there.
For urlencoded query strings, these *cannot* be decoded, because
then you can't
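The point about reserved characters can be illustrated with a small Python sketch using the standard urllib.parse module. The query string below is a hypothetical example chosen for illustration, not one taken from this thread: it carries a single parameter whose value contains an encoded ampersand and equals sign. Decoding the whole string before splitting it turns that one parameter into two.

```python
from urllib.parse import parse_qsl, unquote

# Hypothetical query string: one parameter whose value is "x&y=1".
raw = "bool-expr=x%26y%3D1"

# Split first, then decode each piece: the value survives intact.
kept = parse_qsl(raw)

# Decode first, then split: the decoded "&" and "=" are now
# indistinguishable from real delimiters, so the meaning changes.
broken = parse_qsl(unquote(raw))

print(kept)    # [('bool-expr', 'x&y=1')]
print(broken)  # [('bool-expr', 'x'), ('y', '1')]
```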
On Wed, Jan 16, 2013 at 12:39:16AM +0900, Alex Shinn wrote:
The internal representation is either decoded, or it is encoded.
Either can be made to work.
In this case, the decoded uri-common representation of the former is:
((bool-expr . xy=1))
and the decoded representation of the
On Wed, Jan 16, 2013 at 12:59 AM, Peter Bex peter@xs4all.nl wrote:
On Wed, Jan 16, 2013 at 12:39:16AM +0900, Alex Shinn wrote:
The internal representation is either decoded, or it is encoded.
Either can be made to work.
In this case, the decoded uri-common representation of the
On Mon, Jan 14, 2013 at 02:42:40PM +0900, Alex Shinn wrote:
On Mon, Jan 14, 2013 at 1:36 PM, Sungjin Chun chu...@gmail.com wrote:
As far as I know, the revised RFC permits UTF-8 characters in a URL without
encoding. Am I wrong here?
Thus you can't use raw non-ASCII bytes in a URI - they must
On Mon, Jan 14, 2013 at 09:18:52AM +0100, Peter Bex wrote:
On Mon, Jan 14, 2013 at 02:42:40PM +0900, Alex Shinn wrote:
On Mon, Jan 14, 2013 at 1:36 PM, Sungjin Chun chu...@gmail.com wrote:
As far as I know, the revised RFC permits UTF-8 characters in a URL without
encoding. Am I wrong here?
Thank you very much. :-)
My proposed hack (yes, not a solution) just works for me, but I found that it is
simply wrong w.r.t. the RFC.
I'll try your modification and let you know whether it works or not.
Thank you again.
On Mon, Jan 14, 2013 at 5:08 PM, Ivan Raikov ivan.g.rai...@gmail.com wrote:
Hi
On Tue, Jan 15, 2013 at 7:35 AM, Sungjin Chun chu...@gmail.com wrote:
Thank you very much. :-)
My proposed hack (yes, not a solution) just works for me, but I found that it
is simply wrong w.r.t. the RFC.
I'll try your modification and let you know whether it works or not.
Thank you again.
On
My intention is to create a search client for Solr (a search server built on
Lucene), to which I should send
a request URL like this:
http://127.0.0.1:8983/solr/select?q=삼계탕&start=0&rows=10
I've tried to create this client using the http-client egg and found that
it does not like UTF-8 characters
in the
On Tue, Jan 15, 2013 at 11:50 AM, Sungjin Chun chu...@gmail.com wrote:
My intention is to create a search client for Solr (a search server built on
Lucene), to which I should send
a request URL like this:
http://127.0.0.1:8983/solr/select?q=삼계탕&start=0&rows=10
I've tried to create this client using
Hi all,
I realized that I replied only to Sungjin and neglected to include the
mailing list, so let me repeat.
Section 3.1 of RFC 3987 defines a mapping between IRIs and URIs such that
UTF-8 sequences are percent-encoded.
So I implemented a procedure iri->uri, which percent-encodes a UTF-8
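As a rough illustration of the mapping described in Section 3.1 of RFC 3987, here is a minimal Python sketch that percent-encodes the UTF-8 octets of every non-ASCII character and leaves ASCII characters, including delimiters, untouched. The name iri_to_uri_sketch is hypothetical and this is not the uri-generic procedure; the sketch also treats the host like any other component, which is a simplification.

```python
def iri_to_uri_sketch(iri: str) -> str:
    """Percent-encode the UTF-8 octets of each non-ASCII character.

    Simplified sketch: ASCII characters (including delimiters such as
    "/", "?" and "&") pass through unchanged, and no special handling
    is applied to the host part.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


print(iri_to_uri_sketch("http://example.com/삼계탕"))
# http://example.com/%EC%82%BC%EA%B3%84%ED%83%95
```

Because only non-ASCII characters are rewritten, an input that is already a plain ASCII URI comes back unchanged.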
Hi again,
I have now extended the utf8 code in uri-generic, so that UTF-8
sequences are percent-encoded as lists of the form '(% h1 h2 [% h3 h4
...]). The percent-decoding routine is not going to decode sequences of
more than one byte, so that now percent encoding normalization will not
On Tue, Jan 15, 2013 at 2:23 PM, Ivan Raikov ivan.g.rai...@gmail.com wrote:
Hi again,
I have now extended the utf8 code in uri-generic, so that UTF-8
sequences are percent-encoded as lists of the form '(% h1 h2 [% h3 h4
...]). The percent-decoding routine is not going to decode sequences
Hi Alex,
I understand your point about make-uri, but I want to provide a uri
constructor that takes a UTF-8 input string and maps it in accordance with
RFC 3986 / 3987.
So we still have to perform path and percent-encoding normalization steps
for the ASCII portions of the string. make-uri
Oops, the second example should have been
For the string 삼계탕 the octets are EC 82 BC EA B3 84 ED 83 95 and
(utf8-string->uri "http://example.com/삼계탕") produces
#(URI scheme=http authority=#(URIAuth host=example.com port=#f) path=(/
%EC%82%BC%EA%B3%84%ED%83%95) query=#f fragment=#f)
Sorry about
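The octets quoted in this message can be checked independently. The short Python sketch below lists the UTF-8 octets of a string as two-digit hexadecimal values; utf8_octets is a hypothetical helper name used only for this illustration.

```python
def utf8_octets(text: str) -> list[str]:
    """Return the UTF-8 octets of text as two-digit uppercase hex strings."""
    return [f"{b:02X}" for b in text.encode("utf-8")]


# Each of the three Hangul syllables encodes to three octets.
for ch in "삼계탕":
    print(ch, " ".join(utf8_octets(ch)))

# Prefixing every octet with "%" gives the percent-encoded path segment.
print("".join("%" + octet for octet in utf8_octets("삼계탕")))
```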
To test a Lucene-based Solr client, I have to create a URL that contains
UTF-8 encoded text (Korean). But uri-common cannot create a URI from such a
string.
Can any one help me on this? Thanks.
On Mon, Jan 14, 2013 at 07:04:05AM +0900, Sungjin Chun wrote:
To test a Lucene-based Solr client, I have to create a URL that contains
UTF-8 encoded text (Korean). But uri-common cannot create a URI from such a
string.
Can any one help me on this? Thanks.
Hello Sungjin,
As far as I
Though I'm not that fluent in Scheme, I'll try to write a test case for
uri-generic with a UTF-8 string.
Thanks.
On Mon, Jan 14, 2013 at 7:15 AM, Peter Bex peter@xs4all.nl wrote:
On Mon, Jan 14, 2013 at 07:04:05AM +0900, Sungjin Chun wrote:
For testing solr, lucene based client, I have to
First, I might be looking in the wrong place, but...
It seems that the main source of my problem is this part of
uri-generic.scm:
(define char-set:uri-unreserved
  (char-set-union char-set:letter+digit (string->char-set "-_.~")))
If I change this part to:
(define
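For reference, RFC 3986 defines the unreserved set as ASCII letters, ASCII digits, and the four characters "-", ".", "_" and "~". The Python sketch below builds that set explicitly and contrasts it with a Unicode-aware "letter or digit" test. It is an illustration of why widening the letter class changes which characters get escaped, not a description of how char-set:letter+digit behaves in CHICKEN; the name is_unreserved is hypothetical.

```python
import string

# RFC 3986 unreserved: ALPHA / DIGIT / "-" / "." / "_" / "~" (ASCII only).
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def is_unreserved(ch: str) -> bool:
    """True if ch may appear in a URI without percent-encoding."""
    return ch in UNRESERVED


print(is_unreserved("a"))  # True
print(is_unreserved("~"))  # True
print(is_unreserved("삼"))  # False: must be percent-encoded

# A Unicode-aware alphanumeric test accepts Hangul, so using it as the
# "letter" class would let raw non-ASCII characters through unescaped.
print("삼".isalnum())  # True
```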
As far as I know, the revised RFC permits UTF-8 characters in a URL without
encoding. Am I wrong here?
Even Solr (the search engine) permits them.
On Mon, Jan 14, 2013 at 1:26 PM, Alex Shinn alexsh...@gmail.com wrote:
Hi,
On Mon, Jan 14, 2013 at 12:52 PM, Sungjin Chun chu...@gmail.com wrote:
On Mon, Jan 14, 2013 at 1:36 PM, Sungjin Chun chu...@gmail.com wrote:
As far as I know, the revised RFC permits UTF-8 characters in a URL without
encoding. Am I wrong here?
The latest URI RFC is 3986. The relevant description in prose is:
Local names, such as file system names, are stored