Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
As a follow-up question, I found some of the varchar column types, in which the histogram_bounds are not being surrounded in double quotes ( ) even in the default implementation. Ex : *c_name* column of *Customer* table I also found histogram_bounds in which only some strings are surrounded in double quotes and some are not. Ex : *c_address *column of* Customer *table Why are there such inconsistencies? How is this determined? Thank you. On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote: Hi, I am trying to implement a functionality that is similar to ANALYZE, but needs to have different values (the values will be valid and is stored in inp-str[][]) for MCV/Histogram Bounds in case the column under consideration is varchar (C Strings). I have written a function *dummy_update_attstats* with the following changes. Other things remain the same as in *update_attstats* of *~/src/backend/commands/analyze.c* *---* *{* * ArrayType *arry; * * if (* *strcmp(col_type,varchar) == 0* * )* * arry = construct_array(stats-stavalues[k],* * stats-numvalues[k], * * CSTRINGOID,* * -2, * * false,* * 'c'); * * else* * arry = construct_array(stats-stavalues[k], * * stats-numvalues[k],* * stats-statypid[k], * * stats-statyplen[k],* * stats-statypbyval[k], * * stats-statypalign[k]);* * values[i++] = PointerGetDatum(arry); /* stavaluesN */ }* --- and I update the hist_values in the appropriate function as: --- *if (strcmp(col_type,varchar) == 0**)* * hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),* * false,* * -2);* *---* I tried this based on the following reference : http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff My issue is : When I use my way for strings, the MCV/histogram_bounds in pg_stats doesn't have double quotes ( ) surrounding string. That is, If normal *update_attstats* is used, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA ,ARGENTINA,...* If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA,ARGENTINA,...* This becomes an issue if the string has ',' (commas), like for example in *n_comment* column of *nation* table. Could someone point out the problem and suggest a solution? Thank you. -- Regards, Ashoke -- Regards, Ashoke
Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
Ok, I was able to figure out that when strings contained 'spaces', PostgreSQL appends them with double quotes. On Tue, Jul 8, 2014 at 12:04 PM, Ashoke s.ash...@gmail.com wrote: As a follow-up question, I found some of the varchar column types, in which the histogram_bounds are not being surrounded in double quotes ( ) even in the default implementation. Ex : *c_name* column of *Customer* table I also found histogram_bounds in which only some strings are surrounded in double quotes and some are not. Ex : *c_address *column of* Customer *table Why are there such inconsistencies? How is this determined? Thank you. On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote: Hi, I am trying to implement a functionality that is similar to ANALYZE, but needs to have different values (the values will be valid and is stored in inp-str[][]) for MCV/Histogram Bounds in case the column under consideration is varchar (C Strings). I have written a function *dummy_update_attstats* with the following changes. Other things remain the same as in *update_attstats* of *~/src/backend/commands/analyze.c* *---* *{* * ArrayType *arry; * * if (* *strcmp(col_type,varchar) == 0* * )* * arry = construct_array(stats-stavalues[k],* * stats-numvalues[k], * * CSTRINGOID,* * -2, * * false,* * 'c'); * * else* * arry = construct_array(stats-stavalues[k], * * stats-numvalues[k],* * stats-statypid[k], * * stats-statyplen[k],* * stats-statypbyval[k], * * stats-statypalign[k]);* * values[i++] = PointerGetDatum(arry); /* stavaluesN */ }* --- and I update the hist_values in the appropriate function as: --- *if (strcmp(col_type,varchar) == 0**)* * hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),* * false,* * -2);* *---* I tried this based on the following reference : http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff My issue is : When I use my way for strings, the MCV/histogram_bounds in pg_stats doesn't have double quotes ( ) surrounding string. That is, If normal *update_attstats* is used, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA ,ARGENTINA,...* If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA,ARGENTINA,...* This becomes an issue if the string has ',' (commas), like for example in *n_comment* column of *nation* table. Could someone point out the problem and suggest a solution? Thank you. -- Regards, Ashoke -- Regards, Ashoke -- Regards, Ashoke
[HACKERS] Modifying update_attstats of analyze.c for C Strings
Hi, I am trying to implement a functionality that is similar to ANALYZE, but needs to have different values (the values will be valid and is stored in inp-str[][]) for MCV/Histogram Bounds in case the column under consideration is varchar (C Strings). I have written a function *dummy_update_attstats* with the following changes. Other things remain the same as in *update_attstats* of *~/src/backend/commands/analyze.c* *---* *{* * ArrayType *arry;* * if (* *strcmp(col_type,varchar) == 0* *)* * arry = construct_array(stats-stavalues[k],* * stats-numvalues[k],* * CSTRINGOID,* * -2,* * false,* * 'c');* * else* * arry = construct_array(stats-stavalues[k],* * stats-numvalues[k],* * stats-statypid[k],* * stats-statyplen[k],* * stats-statypbyval[k],* * stats-statypalign[k]);* * values[i++] = PointerGetDatum(arry); /* stavaluesN */}* --- and I update the hist_values in the appropriate function as: --- *if (strcmp(col_type,varchar) == 0**)* * hist_values[i] = datumCopy(CStringGetDatum(inp-str[i][j]),* * false,* * -2);* *---* I tried this based on the following reference : http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff My issue is : When I use my way for strings, the MCV/histogram_bounds in pg_stats doesn't have double quotes ( ) surrounding string. That is, If normal *update_attstats* is used, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA ,ARGENTINA,...* If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH nation(n_name)* are : *ALGERIA,ARGENTINA,...* This becomes an issue if the string has ',' (commas), like for example in *n_comment* column of *nation* table. Could someone point out the problem and suggest a solution? Thank you. -- Regards, Ashoke
Re: [HACKERS] Issue while calling new PostgreSQL command from a Java Application
Thank you Ashutosh*.* That was the issue. But, could you please explain why it worked from command line? On Fri, Jul 4, 2014 at 11:49 AM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: You may have to add code to copy inp_str to _copyVacuumStmt(). See how a character array being copied from other _copy* functions. On Fri, Jul 4, 2014 at 10:43 AM, Ashoke s.ash...@gmail.com wrote: Hi, -- I have defined a new command my_command in PostgreSQL. This command takes the path of ANALYZE and inside analyze.c, I have a function to do some operations if its my_command.This command takes the input arguments: table name, column name and an input string. my_command nation (n_nationkey) 'input string'; When I run this command from command line psql, it works as expected. But when I call the same command from a java application, the variable that stores the input string is NULL. I printed the value of the input string in gram.y file where I have defined my_command. fprintf (stderr, I am inside gram.y %s\n,n-inp_str); and the input string is printed correctly. But when I print stmt-inp_str in the function standard_ProcessUtility() of utility.c for the case T_VacuumStmt, I get the value as NULL. This is as far as I could trace back from analyze.c. I am not sure how executing the same command from an application can make a difference. gram.y content gist: -- MyStmt: my_keyword qualified_name name_list my_inp_str { VacuumStmt *n = makeNode(VacuumStmt); n-options = VACOPT_ANALYZE; n-freeze_min_age = -1; n-freeze_table_age = -1; n-relation = $2; n-va_cols = $3; n-inp_str = $4; fprintf (stderr, I am inside gram.y %s\n,n-inp_str); $$ = (Node *)n; }; char *inp_str is added to the struct VacuumStmt in parsenodes.h --- Only the newly added char *inp_str(that is different from ANALYZE) value is NULL. I was able to retrieve the column name from va_cols. Any help is appreciated. Thanks! -- Regards, Ashoke -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Regards, Ashoke
[HACKERS] Issue while calling new PostgreSQL command from a Java Application
Hi, -- I have defined a new command my_command in PostgreSQL. This command takes the path of ANALYZE and inside analyze.c, I have a function to do some operations if its my_command.This command takes the input arguments: table name, column name and an input string. my_command nation (n_nationkey) 'input string'; When I run this command from command line psql, it works as expected. But when I call the same command from a java application, the variable that stores the input string is NULL. I printed the value of the input string in gram.y file where I have defined my_command. fprintf (stderr, I am inside gram.y %s\n,n-inp_str); and the input string is printed correctly. But when I print stmt-inp_str in the function standard_ProcessUtility() of utility.c for the case T_VacuumStmt, I get the value as NULL. This is as far as I could trace back from analyze.c. I am not sure how executing the same command from an application can make a difference. gram.y content gist: -- MyStmt: my_keyword qualified_name name_list my_inp_str { VacuumStmt *n = makeNode(VacuumStmt); n-options = VACOPT_ANALYZE; n-freeze_min_age = -1; n-freeze_table_age = -1; n-relation = $2; n-va_cols = $3; n-inp_str = $4; fprintf (stderr, I am inside gram.y %s\n,n-inp_str); $$ = (Node *)n; }; char *inp_str is added to the struct VacuumStmt in parsenodes.h --- Only the newly added char *inp_str(that is different from ANALYZE) value is NULL. I was able to retrieve the column name from va_cols. Any help is appreciated. Thanks! -- Regards, Ashoke
Re: [HACKERS] Postgresql XML parsing
Hi, Thanks for the input. I would look into JSON parsing as well, but the requirement is XML parsing. There is no DTD/Schema for the XML. Is there any way I could know what are the possible tags and their values? I am building my parser based on the output PostgreSQL produces (hard coding the tags) and I am afraid I would miss out on tags. Thank you. On Thu, Mar 13, 2014 at 5:47 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, On 03/12/2014 09:36 AM, Ashoke wrote: Hi, I am working on adding a functionality to PostgreSQL. I need to parse the XML format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say C structure). I was wondering if ... The only XML parsing we have is where Postgres is built with libxml, in which case we use its parser. But query plan XML is delivered to a client (or a log file, which means more or less the same thing here). As a HACKERS' matter, explain output can be obtained from ExplainPrintPlan() in any format in backend. I don't know if it is the case though. If you want to parse it then it should be parsed in the client - that's why we provide it. Inside postgres I don't see a point in parsing the XML rather than handling the query plan directly. The worst possible option would be to make a hand-cut XML parser, either in the client or the server - XML parsing has all sorts of wrinkles that can bite you badly. I agree with it. If XML input is not essential, JSON format would be parsed more easily than xml. 9.3 already intrinsically has a JSON parser infrastructure available for the purpose. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Regards, Ashoke
[HACKERS] Postgresql XML parsing
Hi, I am working on adding a functionality to PostgreSQL. I need to parse the XML format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say C structure). I was wondering if PostgreSQL already had any parsing functions implemented that I can use to do the XML parsing. This is getting difficult as I was not able to get any DTD or XML Schema for the XML files generated by PostgreSQL. I found the files xpath.c/xslt_proc.c files that contain parsing related code, but none of the functions are being called for any xml related query I issue to the database and some of the functions in those files mention as deprecated. It would be greatly helpful if someone could guide me on this. Thank you. -- Regards, Ashoke