OK, I was able to figure out that when a string contains spaces,
PostgreSQL encloses it in double quotes.
On Tue, Jul 8, 2014 at 12:04 PM, Ashoke <s.ash...@gmail.com> wrote:
> As a follow-up question,
> I found some varchar columns whose histogram_bounds are not
> surrounded by double quotes (" ") even in the default
> Ex : *c_name* column of *Customer* table
> I also found histogram_bounds in which only some strings are surrounded by
> double quotes and some are not.
> Ex : *c_address* column of *Customer* table
> Why are there such inconsistencies? How is this determined?
> Thank you.
> On Tue, Jul 8, 2014 at 10:52 AM, Ashoke <s.ash...@gmail.com> wrote:
>> I am trying to implement functionality similar to ANALYZE, but it
>> needs to use different values (the values are valid and are stored in
>> inp->str) for the MCV/histogram bounds when the column under
>> consideration is varchar (C strings). I have written a function
>> *dummy_update_attstats* with the following changes. Everything else remains
>> the same as in *update_attstats* of *~/src/backend/commands/analyze.c*:
>>     ArrayType *arry;
>>     if (strcmp(col_type, "varchar") == 0)
>>         arry = construct_array(stats->stavalues[k],
>>                                stats->numvalues[k],
>>                                CSTRINGOID,
>>                                -2,
>>                                false,
>>                                'c');
>>     else
>>         arry = construct_array(stats->stavalues[k],
>>                                stats->numvalues[k],
>>                                stats->statypid[k],
>>                                stats->statyplen[k],
>>                                stats->statypbyval[k],
>>                                stats->statypalign[k]);
>>     values[i++] = PointerGetDatum(arry);  /* stavaluesN */
>> }
>> and I update the hist_values in the appropriate function as:
>>     if (strcmp(col_type, "varchar") == 0)
>>         hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
>>                                    false,
>>                                    -2);
>> I tried this based on the following reference :
>> My issue is: when I use my approach for strings, the MCV/histogram_bounds
>> values in pg_stats don't have double quotes (" ") surrounding each string.
>> That is, if the normal *update_attstats* is used, histogram_bounds for *TPCH
>> nation(n_name)* are: *"ALGERIA ","ARGENTINA ",...*
>> If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH
>> nation(n_name)* are: *ALGERIA,ARGENTINA,...*
>> This becomes an issue when a string contains commas (','), as in the
>> *n_comment* column of the *nation* table.
>> Could someone point out the problem and suggest a solution?
>> Thank you.