On Wed, Sep 2, 2015 at 6:58 PM, Nikolay Shaplov <n.shap...@postgrespro.ru> wrote:
> В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал: > > Nikolay Shaplov wrote: > > > This patch adds several new functions, available from SQL queries. All > > > these functions are based on heap_page_items, but accept slightly > > > different arguments and has one additional column at the result set: > > > > > > heap_page_tuples - accepts relation name, and bulkno, and returns usual > > > heap_page_items set with additional column that contain snapshot of > tuple > > > data area stored in bytea. > > > > I think the API around get_raw_page is more useful generally. You might > > have table blocks stored in a bytea column for instance, because you > > extracted from some remote server and inserted into a local table for > > examination. If you only accept relname/blkno, there's no way to > > examine data other than blocks directly in the database dir, which is > > limiting. > > > > > There is also one strange function: _heap_page_items it is useless for > > > practical purposes. As heap_page_items it accepts page data as bytea, > but > > > also get a relation name. It tries to apply tuple descriptor of that > > > relation to that page data. > > > > For BRIN, I added something similar (brin_page_items), but it receives > > the index OID rather than name, and constructs a tuple descriptor to > > read the data. I think OID is better than name generally. (You can > > cast the relation name to regclass). > > > > It's easy to misuse, but these functions are superuser-only, so there > > should be no security issue at least. The possibility of a crash > > remains real, though, so maybe we should try to come up with a way to > > harden that. > > I've done as you've said: Now patch adds two functions heap_page_item_attrs > and heap_page_item_detoast_attrs and these function takes raw_page and > relation OID as arguments. They also have third optional parameter that > allows > to change error mode from ERROR to WARNING. > > Is it ok now? > Yeah, I think that's acceptable to have a switch, defaulting to ERROR if caller specifies nothing. Here are some observations after looking at the code, no functional testing. + int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL; When handling additional arguments, it seems to me that it is more readable to use something like that: if (PG_NARGS >= 3) { arg = PG_GETARG_XXX(2); //etc. } + error_mode = error_mode?WARNING:ERROR; This generates warnings at compilation. + if (SRF_IS_FIRSTCALL() && (error_mode == WARNING)) + { + ereport(WARNING, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + (errmsg("Runnung heap_page_item_attrs in WARNING mode.")))); + } I am not convinced that this is necessary. + inter_call_data->raw_page_size = VARSIZE(raw_page) - VARHDRSZ; + if (inter_call_data->raw_page_size < SizeOfPageHeaderData) Including raw_page_size in the status data does not seem necessary to me. The page data is already available for each loop so you could just extract it from it. + ereport(inter_call_data->error_level, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("Data corruption: number of attributes in tuple header is greater than number of attributes in tuple descripor"))); I'd rather switch the error reports related to data corruption from ereport to elog, which is more suited for internal errors, and it seems to me that it is the case here. heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (!is_null) ^~~~~~~~ heapfuncs.c:419:43: note: uninitialized use occurs here raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null, BYTEAOID, fctx->multi_call_memory_ctx); ^~~~~~~~ heapfuncs.c:370:3: note: remove the 'if' if its condition is always true if (!is_null) ^~~~~~~~~~~~~ heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence this warning Datum raw_attr; My compiler complains about uninitialized variables. +-- +-- _heap_page_items() +-- +CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text, I am not sure why this is necessary and visibly it overlaps with the existing declaration of heap_page_items. The last argument is different though, and I recall that we decided not to use anymore the relation name specified as text, but its OID instead in this module. Documentation is missing, that would be good to have to understand what each function is intended to do. Code has some whitespaces. -- Michael