Thank you for the new patch. - The latest patch is missing xpath_parser.h at least since ns-3. That of the first (not-numbered) version was still usable.
- c29c578 conflicts on doc/src/sgml/func.sgml At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule <pavel.steh...@gmail.com> wrote in <cafj8prcybh+a6ojoeyufdupbq1yswtt2cfnfzxs2ab9efon...@mail.gmail.com> > 2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI < > horiguchi.kyot...@lab.ntt.co.jp>: > > > Hi, thanks for the new patch. > > > > # The patch is missing xpath_parser.h. That of the first patch was usable. > > > > At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.steh...@gmail.com> > > wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@ > > mail.gmail.com> > > > Hi > > > > > > now xpath and xpath_exists supports default namespace too > > > > At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.steh...@gmail.com> > > wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail. > > gmail.com> > > > > 1. Uniformity among simliar features > > > > > > > > As mentioned in the proposal, but it is lack of uniformity that > > > > the xpath transformer is applied only to xmltable and not for > > > > other xpath related functions. > > > > > > > > > > I have to fix the XPath function. The SQL/XML function Xmlexists doesn't > > > support namespaces/ > > > > Sorry, I forgot to care about that. (And the definition of > > namespace array is of course fabricated by me). I'd like to leave > > this to committers. Anyway it is working but the syntax (or > > whether it is acceptable) is still arguable. > > > > SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com"> > > test</my:a>', > > ARRAY[ARRAY['', 'http://example.com']]); > > | xpath > > | -------- > > | {test} > > | (1 row) > > > > > > The internal name is properly rejected, but the current internal > > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We > > are preserving some short names and reject them as > > user-defined. Doesn't just 'pgsqlxml' work? > > > > > > Default namespace correctly become to be applied on bare > > attribute names. > > > > > updated doc, > > > fixed all variants of expected result test file > > > > Sorry for one by one comment but I found another misbehavior. > > > > create table t1 (id int, doc xml); > > insert into t1 > > values > > (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></ > > rows>'); > > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), > > '/x:rows/x:row' passing t1.doc columns data int PATH > > 'child::x:a[1][attribute::hoge="haha"]') as x; > > | data > > | ------ > > | 50 > > > > but the following fails. > > > > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), > > '/rows/row' passing t1.doc columns data int PATH > > 'child::a[1][attribute::hoge="haha"]') as x; > > | data > > | ------ > > | > > | (1 row) > > > > Perhaps child::a is not prefixed by the transformation. > > > > the problem was in unwanted attribute modification. The parser didn't > detect "attribute::hoge" as attribute. Updated parser does it. I reduce > duplicated code there more. It worked as expected. But the comparison of "attribute" is missing t1.length = 9 so the following expression wrongly passes. child::a[1][attributeabcdefg::hoge="haha" It is confusing that is_qual_name becomes true when t2 is not a "qual name", and the way it treats a double-colon is hard to understand. It essentially does inserting the default namespace before unqualified non-attribute name. I believe we can easily look-ahead to detect a double colon and it would make things simpler. Could you consider something like the attached patch? (applies on top of ns-4 patch.) > > XPath might be complex enough so that it's worth switching to > > yacc/lex based transformer that is formally verifiable and won't > > need a bunch of cryptic tests that finally cannot prove the > > completeness. synchronous_standy_names is far simpler than XPath > > but using yacc/lex parser. > > > > > > Anyway the following is nitpicking of the current xpath_parser.c. > > > > - NODENAME_FIRSTCHAR allows '-' as the first char but it is > > excluded from NameStartChar (https://www.w3.org/TR/REC- > > xml/#NT-NameStartChar) > > I think characters with high-bit set is okay. > > Also IS_NODENAME_CHAR should be changed. > > > > fixed > > > > - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category > > but have different naming schemes. Can these are named in the same way? > > > > fixed > > > > - The current transoformer seems to using up to one token stack > > depth. Maybe the stack is needless. (pushed token is always > > popped just after) > > > > fixed Thank you. I found another (and should be the last, so sorry..) functional defect in this. This doesn't add default namespace if the tag name in a predicate is 'and' or 'or'. It needs to be fixed, or wrote in the documentation as a restriction. (seem hard to fix it..) create table t1 (id int, doc xml); insert into t1 values (1, '<rows xmlns="http://x.y"><row><val>50</val><and>60</and></row></rows>'); select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' passing t1.doc columns data int PATH 'x:val[../x:and = 60]') as x; data ------ 50 (1 row) select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH 'val[../and = 60]') as x; data ------ (1 row) Other comments are follows. - Please add more comments. XPATH_TOKEN_NAME in _transformXPath in my patch has more - Debug output might be needed. # sorry now time's up. will continue tomorrow. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c index f22ec76..1d8d93c 100644 --- a/src/backend/utils/adt/xpath_parser.c +++ b/src/backend/utils/adt/xpath_parser.c @@ -41,6 +41,8 @@ typedef enum XPATH_TOKEN_NAME, XPATH_TOKEN_STRING, XPATH_TOKEN_NUMBER, + XPATH_TOKEN_COLON, + XPATH_TOKEN_DCOLON, XPATH_TOKEN_OTHER } XPathTokenType; @@ -68,6 +70,7 @@ typedef struct ParserData #define IS_NODENAME_CHAR(c) (IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \ ((c) >= '0' && (c) <= '9')) +#define TOKEN_IS_EMPTY(t) ((t)->ttype == XPATH_TOKEN_NONE) /* * Returns next char after last char of token - XPath lexer @@ -112,6 +115,17 @@ getXPathToken(char *str, XPathTokenInfo * ti) ti->ttype = XPATH_TOKEN_STRING; } + else if (c == ':') + { + /* look ahead to detect a doulbe-colon */ + if (*str == ':') + { + ti->ttype = XPATH_TOKEN_DCOLON; + str++; + } + else + ti->ttype = XPATH_TOKEN_COLON; + } else ti->ttype = XPATH_TOKEN_OTHER; @@ -165,6 +179,7 @@ pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti) memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo)); parser->buffer_is_empty = false; + ti->ttype = XPATH_TOKEN_NONE; } /* @@ -179,6 +194,9 @@ writeXPathToken(StringInfo str, XPathTokenInfo * ti) appendBinaryStringInfo(str, ti->start, ti->length); else appendStringInfoChar(str, *ti->start); + + /* this token is comsumed */ + ti->ttype = XPATH_TOKEN_NONE; } /* @@ -192,7 +210,7 @@ _transformXPath(StringInfo str, XPathParserData * parser, { XPathTokenInfo t1, t2; - bool token_is_tagname = false; + bool tagname_needs_defnsp = false; bool token_is_tagattrib = false; nextXPathToken(parser, &t1); @@ -203,7 +221,11 @@ _transformXPath(StringInfo str, XPathParserData * parser, { case XPATH_TOKEN_NUMBER: case XPATH_TOKEN_STRING: - token_is_tagname = false; + /* + * string cannot be a tag name. write out it immediately and + * go ahead + */ + tagname_needs_defnsp = false; token_is_tagattrib = false; writeXPathToken(str, &t1); @@ -212,8 +234,6 @@ _transformXPath(StringInfo str, XPathParserData * parser, case XPATH_TOKEN_NAME: { - bool is_qual_name = false; - /* inside predicate ignore keywords "and" "or" */ if (inside_predicate) { @@ -226,53 +246,56 @@ _transformXPath(StringInfo str, XPathParserData * parser, } } - token_is_tagname = true; + /* look ahead what is following the name token */ + tagname_needs_defnsp = true; nextXPathToken(parser, &t2); - if (t2.ttype == XPATH_TOKEN_OTHER) + if (t2.ttype == XPATH_TOKEN_COLON) + { + /* t1 is a quilified node name. no need to add default one. */ + tagname_needs_defnsp = false; + writeXPathToken(str, &t1); /* namespace name */ + writeXPathToken(str, &t2); /* colon */ + /* get node name */ + nextXPathToken(parser, &t1); + } + else if (t2.ttype == XPATH_TOKEN_DCOLON) + { + /* t1 is an axis name. write out as it is */ + if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9) + token_is_tagattrib = true; + + writeXPathToken(str, &t1); /* axis name */ + writeXPathToken(str, &t2); /* double colon */ + + /* + * The next token may be qualified tag name, process + * it as a fresh token. + */ + nextXPathToken(parser, &t1); + break; + } + else if (t2.ttype == XPATH_TOKEN_OTHER) { + /* function name doesn't require namespace */ if (*t2.start == '(') - token_is_tagname = false; - else if (*t2.start == ':') - { - XPathTokenInfo t3; - - nextXPathToken(parser, &t3); - if (t3.ttype == XPATH_TOKEN_OTHER && *t3.start == ':' - && strncmp(t1.start, "attribute", 9) == 0) - { - /* other syntax for attribute, where we should not apply def namespace */ - appendStringInfo(str, "attribute::"); - nextXPathToken(parser, &t1); - token_is_tagattrib = true; - break; - } - else - { - pushXPathToken(parser, &t3); - is_qual_name = true; - } - } + tagname_needs_defnsp = false; + else + pushXPathToken(parser, &t2); } - if (token_is_tagname && !token_is_tagattrib - && !is_qual_name && def_namespace_name != NULL) + if (tagname_needs_defnsp && !token_is_tagattrib && + def_namespace_name != NULL) appendStringInfo(str, "%s:", def_namespace_name); token_is_tagattrib = false; - writeXPathToken(str, &t1); + /* write maybe-tagname if not consumed */ + if (!TOKEN_IS_EMPTY(&t1)) + writeXPathToken(str, &t1); - if (is_qual_name) - { + /* output t2 if not consumed yet */ + if (!TOKEN_IS_EMPTY(&t2)) writeXPathToken(str, &t2); - nextXPathToken(parser, &t1); - if (t1.ttype == XPATH_TOKEN_NAME) - writeXPathToken(str, &t1); - else - pushXPathToken(parser, &t1); - } - else - pushXPathToken(parser, &t2); nextXPathToken(parser, &t1); } @@ -283,7 +306,6 @@ _transformXPath(StringInfo str, XPathParserData * parser, char c = *t1.start; token_is_tagattrib = false; - token_is_tagname = false; writeXPathToken(str, &t1); @@ -307,10 +329,14 @@ _transformXPath(StringInfo str, XPathParserData * parser, } break; + case XPATH_TOKEN_COLON: + case XPATH_TOKEN_DCOLON: case XPATH_TOKEN_NONE: elog(ERROR, "should not be here"); } } + + elog(LOG, "\"%s\"", str->data); } void
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers