That's pretty slick, Steve. Another variant along the same lines would be to create/alter a view over a base table using IF NOT EXISTS. That way, you let Phoenix track the dynamic columns for you. We have a new feature in 4.8 for declaring a table as an APPEND_ONLY_SCHEMA, and we'll minimize the RPCs for ensuring the metadata is up to date. It works well for dynamically tracking schema as data is processed, a pretty common pattern.
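[A rough sketch of the view-over-base-table pattern described above; the table, view, and column names here are made up for illustration, and the exact property syntax should be checked against the Phoenix 4.8 docs:]

```sql
-- Base table declared APPEND_ONLY_SCHEMA (4.8+) so repeated DDL
-- skips the metadata RPCs when nothing has changed.
CREATE TABLE IF NOT EXISTS event_base (
    id VARCHAR PRIMARY KEY
) APPEND_ONLY_SCHEMA = true, UPDATE_CACHE_FREQUENCY = 900000;

-- Re-issue CREATE VIEW IF NOT EXISTS / ALTER VIEW as new fields
-- appear; Phoenix tracks the added columns in its catalog for you.
CREATE VIEW IF NOT EXISTS event_view (field1 INTEGER)
    AS SELECT * FROM event_base;
ALTER VIEW event_view ADD IF NOT EXISTS field2 VARCHAR;

-- Alternatively, columns can be declared dynamically per UPSERT:
UPSERT INTO event_base (id, "field3" BIGINT) VALUES ('k1', 42);
```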
Thanks,
James

On Friday, June 3, 2016, Steve Terrell <sterr...@oculus360.us> wrote:

> I have a similar situation. I have records with varying fields that I
> wanted to access individually and also as a group.
>
> My actual records are JSON objects, so they look like this:
> {"field1": value1, "field2": value2, …}
> To make matters harder, the fields are also of varying types: ints,
> strings, booleans, floats.
>
> What I ended up doing, and it has worked really well, is to make each
> field in the JSON record a dynamic field in a Phoenix table. And, to
> preserve the integrity of the original record, I have one static string
> field into which the entire original JSON goes.
>
> Because of the dynamic fields, I have to execute an UPSERT for each row
> instead of some kind of batch loading. But I routinely load millions of
> rows like this, sometimes with 20+ fields, and it has never failed on an
> upsert and runs fast enough.
>
> A few of the fields are actually always in each record, so I use those
> for my primary keys.
>
> The end result is great! I can query by any field, quickly pull entire
> records from the key fields, and any JSON object can be loaded.
>
> In my code that translates the JSON into UPSERT statements, I also keep a
> running tab of field names, their types, and number of occurrences. Then
> at the end of my millions of UPSERTs I populate a simple table that just
> lists all the fields/types/counts, so that there is no mystery about what
> fields are available when I want to query the big table. I did this
> because I don't always know what field names are in the JSON, and I don't
> know how to get a list of dynamic field names from Phoenix SQL.
>
> Hope this helps,
> Steve
>
> On Thu, Jun 2, 2016 at 9:28 PM, Stephen Wilcoxon <wilco...@gmail.com> wrote:
>
>> The simple solution is to transform the multi-valued field into a
>> separate linked table with multiple rows per "id".
>> If this is what you mean by "exploding the data", why do you not want
>> to do so?
>>
>> On Thu, Jun 2, 2016 at 7:11 PM, Rahul Jain <rja...@gmail.com> wrote:
>>
>>> Folks,
>>>
>>> I asked this question earlier on the wrong list; posting to the Apache
>>> Phoenix user group on the advice of James Taylor:
>>>
>>> We have a use case of migrating from a legacy database to
>>> HBase/Phoenix. I'd like to hear your thoughts on how to manage the
>>> given multi-valued field:
>>>
>>> Field name: namesList
>>>
>>> Contains: a comma-separated list of strings,
>>> e.g. "Jack, Richard, Fred"
>>>
>>> The field does not have an inherent limit on the number of strings;
>>> however, for a successful migration we need to support up to 100
>>> strings in the field.
>>>
>>> This field is currently indexed in our legacy DB technology, where the
>>> user is able to query by any of the contained strings, e.g.:
>>>
>>> select * from myTable where namesList LIKE '%Fred%';
>>>
>>> Is there any way Phoenix supports generating multiple index values
>>> from a single column in a row? We are looking for a method where we
>>> don't explicitly specify individual offsets for indexing
>>> (namesList[0], namesList[1], etc.), but rather a custom functional
>>> index equivalent generates the indexes as a list, and each index in
>>> the list is persisted individually.
>>>
>>> Effectively, our goal is to avoid full table scans when doing queries
>>> that match values from multi-valued fields such as the example above.
>>> We do want to avoid exploding the data wherever possible.
>>>
>>> Any comments / suggestions are welcome.
>>>
>>> Thank you!
>>>
>>> Rahul
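[For reference, a minimal sketch of the linked-table approach Stephen suggests above; the table and index names are illustrative, not from any real schema. The loading code splits namesList into one row per (id, name), and a secondary index on the name column turns lookups into point gets:]

```sql
-- One row per (record id, contained name); the comma-separated list
-- is split by the loading code before upserting.
CREATE TABLE IF NOT EXISTS my_table_names (
    id   VARCHAR NOT NULL,
    name VARCHAR NOT NULL
    CONSTRAINT pk PRIMARY KEY (id, name)
);

-- Secondary index so that lookups by name avoid a full table scan.
CREATE INDEX IF NOT EXISTS idx_names_by_name ON my_table_names (name);

-- Equality lookup driven by the index:
SELECT id FROM my_table_names WHERE name = 'Fred';
```

[Note that a LIKE '%Fred%' with a leading wildcard still cannot use an index; it is the equality (or prefix) match on the exploded rows that makes this fast.]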