Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Pavel Stehule Mon, 02 Oct 2017 03:45:01 -0700

2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp>:


> Hi, thanks for the new patch.
>
> # The patch is missing xpath_parser.h. That of the first patch was usable.
>
> At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.steh...@gmail.com>
> wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@
> mail.gmail.com>
> > Hi
> >
> > now xpath and xpath_exists supports default namespace too
>
> At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.steh...@gmail.com>
> wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.
> gmail.com>
> > > 1. Uniformity among simliar features
> > >
> > >   As mentioned in the proposal, but it is lack of uniformity that
> > >   the xpath transformer is applied only to xmltable and not for
> > >   other xpath related functions.
> > >
> >
> > I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
> > support namespaces/
>
> Sorry, I forgot to care about that. (And the definition of
> namespace array is of course fabricated by me). I'd like to leave
> this to committers.  Anyway it is working but the syntax (or
> whether it is acceptable) is still arguable.
>
> SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com";>
> test</my:a>',
>              ARRAY[ARRAY['', 'http://example.com']]);
> |  xpath
> | --------
> |  {test}
> | (1 row)
>
>
> The internal name is properly rejected, but the current internal
> name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
> are preserving some short names and reject them as
> user-defined. Doesn't just 'pgsqlxml' work?
>

LibXML2 does trim to 100 bytes length names. So
pgdefnamespace.pgsqlxml.internal
is safe from this perspective.

I would to decraese a risk of possible collision, so longer string is
better. Maybe "pgsqlxml.internal" is good enoug - I have not a idea. But if
somewhere will be this string printed, then
"pgdefnamespace.pgsqlxml.internal" has clean semantic, and it is reason,
why I prefer this string. PostgreSQL uses 63 bytes names - and this string
is correct too.


>
> Default namespace correctly become to be applied on bare
> attribute names.
>
> > updated doc,
> > fixed all variants of expected result test file
>
> Sorry for one by one comment but I found another misbehavior.
>
> create table t1 (id int, doc xml);
> insert into t1
>    values
>    (5, '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></
> rows>');
> select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
> '/x:rows/x:row' passing t1.doc columns data int PATH
> 'child::x:a[1][attribute::hoge="haha"]') as x;
> |  data
> | ------
> |    50
>
> but the following fails.
>
> select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
> '/rows/row' passing t1.doc columns data int PATH
> 'child::a[1][attribute::hoge="haha"]') as x;
> |  data
> | ------
> |
> | (1 row)
>
> Perhaps child::a is not prefixed by the transformation.
>
> XPath might be complex enough so that it's worth switching to
> yacc/lex based transformer that is formally verifiable and won't
> need a bunch of cryptic tests that finally cannot prove the
> completeness. synchronous_standy_names is far simpler than XPath
> but using yacc/lex parser.
>

I don't think (not yet) - it is simple state machine now, and when the code
will be stable, then will not be modified.

Thank you for comments, I'll look on it

Regards

Pavel


>
> Anyway the following is nitpicking of the current xpath_parser.c.
>
> - NODENAME_FIRSTCHAR allows '-' as the first char but it is
>   excluded from NameStartChar (https://www.w3.org/TR/REC-
> xml/#NT-NameStartChar)
>   I think characters with high-bit set is okay.
>   Also IS_NODENAME_CHAR should be changed.
>
> - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category
>   but have different naming schemes. Can these are named in the same way?
>
> - The current transoformer seems to using up to one token stack
>   depth. Maybe the stack is needless. (pushed token is always
>   popped just after)
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
>

Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Reply via email to