Hi,

On 2018-09-21 16:57:43 +1000, Haribabu Kommi wrote:
> During the porting of Fujitsu in-memory columnar store on top of pluggable
> storage, I found that the callers of the "heap_beginscan" are expecting
> the returned data is always contains all the records.

Right.

> For example, in the sequential scan, the heap returns the slot with
> the tuple or with value array of all the columns and then the data gets
> filtered and later removed the unnecessary columns with projection.
> This works fine for the row based storage. For columnar storage, if
> the storage knows that upper layers needs only particular columns,
> then they can directly return the specified columns and there is no
> need of projection step. This will help the columnar storage also
> to return proper columns in a faster way.

I think this is an important feature, but I feel fairly strongly that we
should only tackle it in a second version. This patchset is already
pretty darn large.

It's imo not just helpful for columnar, but even for heap - we e.g. spend
a lot of time deforming columns that are never accessed. That's
particularly harmful when the leading columns are all NOT NULL and fixed
width, but even if not, it's painful.

> Is it good to pass the plan to the storage, so that they can find out
> the columns that needs to be returned?

I don't think that's the right approach - this should be a level *below*
plan nodes, not reference them. I suspect we're going to have to have a
new table_scan_set_columnlist() option or such.

> And also if the projection can handle in the storage itself for some
> scenarios, need to be informed the callers that there is no need to
> perform the projection extra.

I don't think that should be done in the storage layer - that's probably
better done introducing custom scan nodes and such. This has costing
implications etc, so this needs to happen *before* planning is finished.

Greetings,

Andres Freund
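
To illustrate the table_scan_set_columnlist() idea mentioned above, here
is a minimal, self-contained sketch. It is not PostgreSQL code: ToyScan,
toy_beginscan(), toy_scan_set_columnlist() and toy_getnext() are all
hypothetical names invented for this example, and the "table" is just an
array of fixed-width int columns. The only point is that the scan is told
which columns are needed through its own API, without looking at plan
nodes, and then skips extracting the rest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_COLS 8

/* Hypothetical scan descriptor over a toy row-major table of ints. */
typedef struct ToyScan
{
    const int  *rows;                   /* nrows * ncols values */
    size_t      nrows;
    size_t      ncols;
    size_t      next;                   /* next row to return */
    bool        needed[TOY_MAX_COLS];   /* columns the caller asked for */
    size_t      deformed;               /* column values actually extracted */
} ToyScan;

/* Start a scan; by default every column is returned, as callers expect today. */
void
toy_beginscan(ToyScan *scan, const int *rows, size_t nrows, size_t ncols)
{
    assert(ncols <= TOY_MAX_COLS);
    scan->rows = rows;
    scan->nrows = nrows;
    scan->ncols = ncols;
    scan->next = 0;
    scan->deformed = 0;
    memset(scan->needed, 0, sizeof(scan->needed));
    for (size_t c = 0; c < ncols; c++)
        scan->needed[c] = true;
}

/* Assumed API shape: restrict the scan to the listed column indexes. */
void
toy_scan_set_columnlist(ToyScan *scan, const size_t *cols, size_t n)
{
    memset(scan->needed, 0, sizeof(scan->needed));
    for (size_t i = 0; i < n; i++)
    {
        assert(cols[i] < scan->ncols);
        scan->needed[cols[i]] = true;
    }
}

/*
 * Fetch the next row. Only needed columns are copied into values[];
 * valid[c] tells the caller whether values[c] was filled in.
 * Returns false once the scan is exhausted.
 */
bool
toy_getnext(ToyScan *scan, int *values, bool *valid)
{
    if (scan->next >= scan->nrows)
        return false;

    const int *row = scan->rows + scan->next * scan->ncols;

    for (size_t c = 0; c < scan->ncols; c++)
    {
        valid[c] = scan->needed[c];
        if (scan->needed[c])
        {
            values[c] = row[c];
            scan->deformed++;
        }
    }
    scan->next++;
    return true;
}
```

A caller would run toy_beginscan(), then toy_scan_set_columnlist() with
the column indexes it needs, then loop over toy_getnext(). In this toy the
saving is only a skipped copy; in a real heap or columnar AM the
corresponding saving would be the deforming or column reads that are
avoided.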