On Sun, Sep 2, 2018 at 12:29 AM Simon Slavin <slav...@bigfraud.org> wrote:
> On 1 Sep 2018, at 11:03pm, Zsbán Ambrus <amb...@math.bme.hu> wrote:
> > Would you consider adding an aggregate function "string_agg" which is
> > a synonym for the "group_concat" function but takes only two
> > arguments?
>
> Could you look up the definition of the existing function on
>
> <https://sqlite.org/lang_aggfunc.html#groupconcat>
>
> and describe that again ?  The existing function already has a two-argument 
> form.

Thank you for your quick reply.  Yes, the group_concat function does
exactly what I'm trying to do.  It is an aggregate function that
concatenates several strings, one string computed from each row in the
same aggregate group, omitting NULL values, and adds a separator
string between each two strings.  That's why the new function I'm
proposing would be a _synonym_: it behaves exactly the same as the
group_concat function that SQLite already has, except that it might
refuse to work with just one operand.

As an example, I've recently used this function in an aggregate query
in MS SQL, which you can see at
"http://data.stackexchange.com/scifi/query/891712/"; .  The query
statement is:

    SELECT MAX(r.CreationDate) AS d,
    'site://posts/' + CAST(r.PostId AS nvarchar) + '/revisions|' +
    STRING_AGG(r.Comment, ' ; ') AS c,
    ISNULL(MAX(p.Title), MAX(q.Title)) AS t
    FROM PostHistory as r
    JOIN Posts as p ON r.PostId = p.Id
    LEFT JOIN Posts as q ON p.ParentId = q.Id
    JOIN Users AS m ON m.AccountId = ##MyAccountId?1192385##
    WHERE m.Id = p.OwnerUserId AND m.ID = r.UserId
    AND r.PostHistoryTypeId IN (4,5,6,7,8,9)
    AND EXISTS(SELECT o.Id FROM PostHistory AS o
    WHERE r.PostId = o.PostId AND o.PostHistoryTypeId IN (4,5,6,7,8,9)
    AND m.Id <> o.UserId AND o.CreationDate < r.CreationDate)
    GROUP BY r.PostId
    ORDER BY MAX(r.CreationDate) DESC;

This groups rows of the PostHistory as r table by the r.PostId field,
takes the r.Comment field from each row within a group, concatenates
it separated by a semicolon and spaces, and returns it in the c column
of the result.  The joins of p and q are used to produce the t column,
the join of m is used in a filter condition, these are not relevant
for understanding how I use the string_agg function here.

The part between double hash marks is a placeholder, which is custom
syntax of data.stackexchange.com.  The + operators here do string
concatenation, I would have to change those to || in SQLite.  But
apart from these two and the string_agg function, the rest of the
statement would probably work in SQLite unchanged if you had the same
database schema.  The SQLite query could be simplified, because you
could omit the cast from number to string and omit the two MAX calls
that produce the t column, but they don't do any harm in SQLite
either.

-- Ambrus
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to