Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-08 Thread Ashoke
As a follow-up question,

I found some of the varchar column types, in which the histogram_bounds are
not being surrounded in double quotes ( ) even in the default
implementation.
Ex : *c_name* column of *Customer* table

I also found histogram_bounds in which only some strings are surrounded in
double quotes and some are not.
Ex : *c_address *column of* Customer *table

Why are there such inconsistencies? How is this determined?

Thank you.


On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:

 Hi,

 I am trying to implement a functionality that is similar to ANALYZE, but
 needs to have different values (the values will be valid and is stored in
 inp-str[][]) for MCV/Histogram Bounds in case the column under
 consideration is varchar (C Strings). I have written a function
 *dummy_update_attstats* with the following changes. Other things remain
 the same as in *update_attstats* of *~/src/backend/commands/analyze.c*


 *---*
 *{*

 * ArrayType  *arry; *
 * if (*
 *strcmp(col_type,varchar) == 0*
 * )*
 * arry = construct_array(stats-stavalues[k],*
 * stats-numvalues[k], *
 * CSTRINGOID,*
 * -2, *
 * false,*
 * 'c'); *
 * else*
 * arry = construct_array(stats-stavalues[k], *
 * stats-numvalues[k],*
 * stats-statypid[k], *
 * stats-statyplen[k],*
 * stats-statypbyval[k], *
 * stats-statypalign[k]);*
 * values[i++] = PointerGetDatum(arry); /* stavaluesN */ }*
   ---

 and I update the hist_values in the appropriate function as:
   ---

 *if (strcmp(col_type,varchar) == 0**)*
 * hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),*
 * false,*
 * -2);*
 *---*

 I tried this based on the following reference :
 http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

 My issue is : When I use my way for strings, the MCV/histogram_bounds in
 pg_stats doesn't have double quotes ( ) surrounding string. That is,

 If normal *update_attstats* is used, histogram_bounds for *TPCH
 nation(n_name)* are : *ALGERIA   ,ARGENTINA,...*
 If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH
 nation(n_name)* are : *ALGERIA,ARGENTINA,...*

 This becomes an issue if the string has ',' (commas), like for example in
 *n_comment* column of *nation* table.

 Could someone point out the problem and suggest a solution?

 Thank you.

 --
 Regards,
 Ashoke




-- 
Regards,
Ashoke


Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-08 Thread Ashoke
Ok, I was able to figure out that when strings contained 'spaces',
PostgreSQL appends them with double quotes.


On Tue, Jul 8, 2014 at 12:04 PM, Ashoke s.ash...@gmail.com wrote:

 As a follow-up question,

 I found some of the varchar column types, in which the histogram_bounds
 are not being surrounded in double quotes ( ) even in the default
 implementation.
 Ex : *c_name* column of *Customer* table

 I also found histogram_bounds in which only some strings are surrounded in
 double quotes and some are not.
 Ex : *c_address *column of* Customer *table

 Why are there such inconsistencies? How is this determined?

 Thank you.


 On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:

 Hi,

 I am trying to implement a functionality that is similar to ANALYZE, but
 needs to have different values (the values will be valid and is stored in
 inp-str[][]) for MCV/Histogram Bounds in case the column under
 consideration is varchar (C Strings). I have written a function
 *dummy_update_attstats* with the following changes. Other things remain
 the same as in *update_attstats* of *~/src/backend/commands/analyze.c*


 *---*
 *{*

 * ArrayType  *arry; *
 * if (*
 *strcmp(col_type,varchar) == 0*
 * )*
 * arry = construct_array(stats-stavalues[k],*
 * stats-numvalues[k], *
 * CSTRINGOID,*
 * -2, *
 * false,*
 * 'c'); *
 * else*
 * arry = construct_array(stats-stavalues[k], *
 * stats-numvalues[k],*
 * stats-statypid[k], *
 * stats-statyplen[k],*
 * stats-statypbyval[k], *
 * stats-statypalign[k]);*
 * values[i++] = PointerGetDatum(arry); /* stavaluesN */ }*
   ---

 and I update the hist_values in the appropriate function as:
   ---

 *if (strcmp(col_type,varchar) == 0**)*
 * hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),*
 * false,*
 * -2);*
 *---*

 I tried this based on the following reference :
 http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

 My issue is : When I use my way for strings, the MCV/histogram_bounds in
 pg_stats doesn't have double quotes ( ) surrounding string. That is,

 If normal *update_attstats* is used, histogram_bounds for *TPCH
 nation(n_name)* are : *ALGERIA   ,ARGENTINA,...*
 If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH
 nation(n_name)* are : *ALGERIA,ARGENTINA,...*

 This becomes an issue if the string has ',' (commas), like for example in
 *n_comment* column of *nation* table.

 Could someone point out the problem and suggest a solution?

 Thank you.

 --
 Regards,
 Ashoke




 --
 Regards,
 Ashoke







-- 
Regards,
Ashoke


[HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-07 Thread Ashoke
Hi,

I am trying to implement a functionality that is similar to ANALYZE, but
needs to have different values (the values will be valid and is stored in
inp-str[][]) for MCV/Histogram Bounds in case the column under
consideration is varchar (C Strings). I have written a function
*dummy_update_attstats* with the following changes. Other things remain the
same as in *update_attstats* of *~/src/backend/commands/analyze.c*


*---*
*{*

* ArrayType  *arry;*
* if (*
*strcmp(col_type,varchar) == 0*
*)*
* arry = construct_array(stats-stavalues[k],*
* stats-numvalues[k],*
* CSTRINGOID,*
* -2,*
* false,*
* 'c');*
* else*
* arry = construct_array(stats-stavalues[k],*
* stats-numvalues[k],*
* stats-statypid[k],*
* stats-statyplen[k],*
* stats-statypbyval[k],*
* stats-statypalign[k]);*
* values[i++] = PointerGetDatum(arry); /* stavaluesN */}*
  ---

and I update the hist_values in the appropriate function as:
  ---

*if (strcmp(col_type,varchar) == 0**)*
* hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),*
* false,*
* -2);*
*---*

I tried this based on the following reference :
http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is : When I use my way for strings, the MCV/histogram_bounds in
pg_stats doesn't have double quotes ( ) surrounding string. That is,

If normal *update_attstats* is used, histogram_bounds for *TPCH
nation(n_name)* are : *ALGERIA   ,ARGENTINA,...*
If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH
nation(n_name)* are : *ALGERIA,ARGENTINA,...*

This becomes an issue if the string has ',' (commas), like for example in
*n_comment* column of *nation* table.

Could someone point out the problem and suggest a solution?

Thank you.

-- 
Regards,
Ashoke


Re: [HACKERS] Issue while calling new PostgreSQL command from a Java Application

2014-07-04 Thread Ashoke
Thank you Ashutosh*.* That was the issue. But, could you please explain why
it worked from command line?


On Fri, Jul 4, 2014 at 11:49 AM, Ashutosh Bapat 
ashutosh.ba...@enterprisedb.com wrote:

 You may have to add code to copy inp_str to _copyVacuumStmt(). See how a
 character array being copied from other _copy* functions.


 On Fri, Jul 4, 2014 at 10:43 AM, Ashoke s.ash...@gmail.com wrote:

 Hi,

 --

 I have defined a new command my_command in PostgreSQL. This command
 takes the path of ANALYZE and inside analyze.c, I have a function to do
 some operations if its my_command.This command takes the input
 arguments: table name, column name and an input string.

 my_command nation (n_nationkey) 'input string';

 When I run this command from command line psql, it works as expected.
 But when I call the same command from a java application, the variable that
 stores the input string is NULL.

 I printed the value of the input string in gram.y file where I have
 defined my_command.
 fprintf (stderr, I am inside gram.y %s\n,n-inp_str); and the input
 string is printed correctly.

 But when I print stmt-inp_str in the function standard_ProcessUtility()
  of utility.c for the case T_VacuumStmt, I get the value as NULL. This
 is as far as I could trace back from analyze.c.

 I am not sure how executing the same command from an application can make
 a difference.

 gram.y content gist:
 --

 MyStmt:
 my_keyword qualified_name name_list my_inp_str
 {
 VacuumStmt *n = makeNode(VacuumStmt);
 n-options = VACOPT_ANALYZE;
 n-freeze_min_age = -1;
 n-freeze_table_age = -1;
 n-relation = $2;
 n-va_cols = $3;
 n-inp_str = $4;
 fprintf (stderr, I am inside gram.y %s\n,n-inp_str);

 $$ = (Node *)n;
 };

 char *inp_str is added to the struct VacuumStmt in parsenodes.h

 ---

 Only the newly added char *inp_str(that is different from ANALYZE) value
 is NULL. I was able to retrieve the column name from va_cols.

 Any help is appreciated. Thanks!
 --
 Regards,
 Ashoke





 --
 Best Wishes,
 Ashutosh Bapat
 EnterpriseDB Corporation
 The Postgres Database Company




-- 
Regards,
Ashoke


[HACKERS] Issue while calling new PostgreSQL command from a Java Application

2014-07-03 Thread Ashoke
Hi,

--

I have defined a new command my_command in PostgreSQL. This command takes
the path of ANALYZE and inside analyze.c, I have a function to do some
operations if its my_command.This command takes the input arguments: table
name, column name and an input string.

my_command nation (n_nationkey) 'input string';

When I run this command from command line psql, it works as expected. But
when I call the same command from a java application, the variable that
stores the input string is NULL.

I printed the value of the input string in gram.y file where I have defined
my_command.
fprintf (stderr, I am inside gram.y %s\n,n-inp_str); and the input
string is printed correctly.

But when I print stmt-inp_str in the function standard_ProcessUtility() of
utility.c for the case T_VacuumStmt, I get the value as NULL. This is as
far as I could trace back from analyze.c.

I am not sure how executing the same command from an application can make a
difference.

gram.y content gist:
--

MyStmt:
my_keyword qualified_name name_list my_inp_str
{
VacuumStmt *n = makeNode(VacuumStmt);
n-options = VACOPT_ANALYZE;
n-freeze_min_age = -1;
n-freeze_table_age = -1;
n-relation = $2;
n-va_cols = $3;
n-inp_str = $4;
fprintf (stderr, I am inside gram.y %s\n,n-inp_str);

$$ = (Node *)n;
};

char *inp_str is added to the struct VacuumStmt in parsenodes.h

---

Only the newly added char *inp_str(that is different from ANALYZE) value is
NULL. I was able to retrieve the column name from va_cols.

Any help is appreciated. Thanks!
-- 
Regards,
Ashoke


Re: [HACKERS] Postgresql XML parsing

2014-03-13 Thread Ashoke
Hi,

  Thanks for the input. I would look into JSON parsing as well, but the
requirement is XML parsing.

  There is no DTD/Schema for the XML. Is there any way I could know what
are the possible tags and their values? I am building my parser based on
the output PostgreSQL produces (hard coding the tags) and I am afraid I
would miss out on tags.

  Thank you.


On Thu, Mar 13, 2014 at 5:47 AM, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hello,

  On 03/12/2014 09:36 AM, Ashoke wrote:
   Hi,
  
  I am working on adding a functionality to PostgreSQL. I need to
 parse
  the XML format query plan (produced by PostgreSQL v9.3) and save it
 in
  a simple data structure (say C structure). I was wondering if
 ...
  The only XML parsing we have is where Postgres is built with libxml,
  in which case we use its parser. But query plan XML is delivered to a
  client (or a log file, which means more or less the same thing
  here).

 As a HACKERS' matter, explain output can be obtained from
 ExplainPrintPlan() in any format in backend. I don't know if it
 is the case though.

  If you want to parse it then it should be parsed in the client
  - that's why we provide it. Inside postgres I don't see a point in
  parsing the XML rather than handling the query plan directly.
 
  The worst possible option would be to make a hand-cut XML parser,
  either in the client or the server - XML parsing has all sorts of
  wrinkles that can bite you badly.

 I agree with it. If XML input is not essential, JSON format would
 be parsed more easily than xml. 9.3 already intrinsically has a
 JSON parser infrastructure available for the purpose.

 regards,

 --
 Kyotaro Horiguchi
 NTT Open Source Software Center




-- 
Regards,
Ashoke


[HACKERS] Postgresql XML parsing

2014-03-12 Thread Ashoke
Hi,

   I am working on adding a functionality to PostgreSQL. I need to parse
the XML format query plan (produced by PostgreSQL v9.3) and save it in a
simple data structure (say C structure). I was wondering if PostgreSQL
already had any parsing functions implemented that I can use to do the XML
parsing. This is getting difficult as I was not able to get any DTD or XML
Schema for the XML files generated by PostgreSQL.

   I found the files xpath.c/xslt_proc.c files that contain parsing related
code, but none of the functions are being called for any xml related query
I issue to the database and some of the functions in those files mention as
deprecated.

   It would be greatly helpful if someone could guide me on this.

   Thank you.

-- 
Regards,
Ashoke