Persisting dynamic column names+types in Phoenix is exactly what views are for.

On Wed, Dec 26, 2018 at 12:05 PM Vincent Poon <vincentp...@apache.org> wrote:

> A lot of work is currently going into handling large numbers of views -
> splittable syscat, view management, etc... but agree that it's not ideal.
>
> There's currently no built-in way to do what you want AFAIK, but you can
> manage the columns yourself in a separate table:
> - store them all in a single column value, and read that value before
> doing your query. HBase checkAndMutate for locking.
> or
> - store each column as separate rows. Then you can do things like filter
> by column name efficiently.
> You could 'soft delete' by removing the entries.
>
> Would be a nice improvement to have an option to persist dynamic column
> names+types in Phoenix.
>
> On Fri, Dec 21, 2018 at 12:18 PM Clay Baenziger (BLOOMBERG/ 731 LEX) <
> cbaenzi...@bloomberg.net> wrote:
>
>> Hello,
>>
>> A user of mine brought up a question around dynamic columns in Phoenix
>> today. The quantity of columns should become asymptotic to a few tens of
>> thousands of columns as their data fills in.
>>
>> The user wants to query all columns in a table and they are today thinking
>> of using views to do this -- but it is ugly management. They have an
>> unbounded number of views -- which will pollute the global catalog and fail
>> relatively quickly.
>>
>> Has anyone thought about the potentially wasteful[1] approach of scanning
>> all rows in a query to determine columns and then re-running the query for
>> the rows once we know what columns the SQL result will contain? Maybe
>> something cleaner like persisting the set of columns in the statistics
>> table and a SELECT * may return columns with nothing but nulls. Or, even
>> better, is there an overall better way to model such a wide schema in
>> Phoenix?
>>
>> -Clay
>>
>> [1]: Perhaps some heuristics could allow for not needing to do 2n reads
>> in all cases?
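
For reference, the "store each column as separate rows" idea quoted above could be sketched roughly as follows. This is only an illustration, not anything Phoenix ships: ColumnRegistry, build_select, EVENT_LOG and EVENT_ID are hypothetical names, and the in-memory dict stands in for what would really be a separate Phoenix/HBase table with one row per dynamic column. The only Phoenix-specific part is the generated statement, which uses the dynamic column syntax of declaring name/type pairs in parentheses after the table name.

```python
import re

# Conservative checks so registry values can be spliced into SQL text safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TYPE = re.compile(r"^[A-Z_ ]+(\(\d+(,\s*\d+)?\))?$")


class ColumnRegistry:
    """Hypothetical stand-in for a metadata table: one entry per dynamic column."""

    def __init__(self):
        self._cols = {}  # upper-cased column name -> Phoenix type string

    def register(self, name, sql_type):
        sql_type = sql_type.upper()
        if not _IDENT.match(name):
            raise ValueError(f"bad column name: {name!r}")
        if not _TYPE.match(sql_type):
            raise ValueError(f"bad column type: {sql_type!r}")
        self._cols[name.upper()] = sql_type

    def remove(self, name):
        # "Soft delete": the data stays in the table, the column is just
        # no longer declared in generated queries.
        self._cols.pop(name.upper(), None)

    def columns(self, prefix=""):
        # Filtering by column name, as one row per column would allow.
        prefix = prefix.upper()
        return sorted((n, t) for n, t in self._cols.items() if n.startswith(prefix))


def build_select(table, key_columns, registry, prefix=""):
    """Build a SELECT that declares every registered dynamic column."""
    cols = registry.columns(prefix)
    select_list = ", ".join(list(key_columns) + [n for n, _ in cols])
    if not cols:
        return f"SELECT {select_list} FROM {table}"
    decl = ", ".join(f"{n} {t}" for n, t in cols)
    return f"SELECT {select_list} FROM {table} ({decl})"


if __name__ == "__main__":
    reg = ColumnRegistry()
    reg.register("last_gc_time", "TIME")
    reg.register("used_memory", "BIGINT")
    print(build_select("EVENT_LOG", ["EVENT_ID"], reg))
```

The writer would have to register each new column in the same code path that upserts it, otherwise the registry drifts from the data; that bookkeeping is the cost of this approach compared with views.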